This document is updated from
analyze_temp_serial_transfer_expt--16July24.Rmd.
There are 5 treatments: no heat (5 days of serial transfer), 6h heat, 12h heat, 24h heat, and 48h heat. Each treatment was set up with 5 technical replicates that were initially inoculated at approximately equal ratios. I was hoping to have two batch replicates, but in the end Maddy decided it was best to have just 1 batch replicate.
Summary of choices: Based on the number of cells observed in true blank wells, I only include data from wells with >50 cells. Based on controls, a misclassification rate of 1% is assumed; time series flagged as contaminated at that threshold are removed from the data.
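As a minimal sketch of the inclusion rule (toy data; the column names match the counts data loaded below, but `flow_data` and `min_total` are illustrative names, not objects used in this pipeline):

```r
library(dplyr)

# toy stand-in for the flow counts; the real data are loaded further below
flow_data <- data.frame(Count_putida    = c(40, 500),
                        Count_protegens = c(5, 80),
                        Count_grimontii = c(3, 120),
                        Count_veronii   = c(1, 60))
min_total <- 50  # threshold derived from the true blank wells

# keep only wells whose total event count exceeds the blank-derived threshold
kept <- flow_data %>%
  mutate(Total = Count_putida + Count_protegens + Count_grimontii + Count_veronii) %>%
  filter(Total > min_total)
```

Here the first toy well (total = 49) is dropped and the second (total = 760) is kept.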
xx load the fluorescence data
xx label the fluorescence data
xx generate a uniqID that is consistent across time points
xx average the Day0 replicates
xx produce figures to summarize the Day0 variation between measurements
xx produce a table summarizing the misclassification rate for each species
xx set a threshold for minimum total counts in a well as determined from true well blanks
xx distinguish extinct wells from true NA wells using OD data
o FOR 24-07-02 INOCULUM: try to load the .fcs files back into the Attune software to extract the missing Volumes
o FOR AUGUST SAMPLES ONLY: systematically re-check the contaminated/miscalled replicates at the 1% threshold
o take some representative screenshots for contamination and for miscalled events
o also take some representative screenshots of the gating strategy
o produce a figure to illustrate miscall/contamination exclusion strategy (Can use a more lenient & a more stringent strategy)??
o use machine learning to classify cells <-- I can use the pilot data and the monoculture data as training data??
xx plot the relative densities
xx load the volume data
xx plot the absolute densities
xx plot each sample as a time series
o Diversity vs effect size: get signif for log(Total+1)
o how much correlation is there between total density estimate by Flow vs by OD?
o check to see if the Diversity vs total effect size trend also happens for the OD data
(o if it does, then I will have to double-check that OD is linear with CFU in the range used here)
o Diversity vs effect size: investigate heat duration as numeric instead of factor. To do this, check out the paper that Maddy sent along.
o also run the analysis that Eric Allen mentioned at Lynn Govaert’s seminar
o fit a model to the time-series data
Load in the fluorescence and OD datasets. Annotate the data.
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.1 ✔ tibble 3.2.1
## ✔ lubridate 1.9.3 ✔ tidyr 1.3.1
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(readxl) # for importing data directly from Excel sheet
library(RColorBrewer) # for changing the colours of plots
library(ggbeeswarm) # for beeswarm plots
library(vegan) # to estimate diversity and for ordination (PCA & NMDS)
## Loading required package: permute
## Loading required package: lattice
## This is vegan 2.6-8
library(lme4) # for fitting and troubleshooting GLMs <-- not sure this is needed?
## Loading required package: Matrix
##
## Attaching package: 'Matrix'
##
## The following objects are masked from 'package:tidyr':
##
## expand, pack, unpack
library(glmmTMB) # (ditto as above)
library(DHARMa) # for plotting the residuals when using glmmTMB
## This is DHARMa 0.4.7. For overview type '?DHARMa'. For recent changes, type news(package = 'DHARMa')
library(effsize) # for post-hoc estimate of effect sizes
library(emmeans) # (ditto as above)
## Welcome to emmeans.
## Caution: You lose important information if you filter this package's results.
## See '? untidy'
library(BSDA) # for pairwise t-tests to compare effect sizes between data subsets
##
## Attaching package: 'BSDA'
##
## The following object is masked from 'package:datasets':
##
## Orange
library(ggforce) # for plotting ellipses in ggplot
library(partitionBEFsp) # for partitioning the biodiversity effects
library(ape) # for ordination (PCoA) <-- this isn't actually used for any analysis
##
## Attaching package: 'ape'
##
## The following object is masked from 'package:dplyr':
##
## where
library(ggordiplots) # for ggplotting ellipses around treatment group centroids during ordination
## Loading required package: glue
# set theme for all plots
fave_theme <- theme_light() + # see other options at https://ggplot2.tidyverse.org/reference/ggtheme.html
theme(text = element_text(size=15), # larger text size for titles & axes
panel.grid.major = element_blank(), # remove major gridlines
panel.grid.minor = element_blank()) # remove minor gridlines
theme_set(fave_theme)
# define a palette for plotting the 4 species
species_pal = brewer.pal(9, "Set1")[c(1, 3, 7, 4, 2, 5)] # this is all 6 colours matching with the slide colours
species_pal_only4 = species_pal[c(1,4:6)] # this is the 4 focal strains in alphabetical order
# define a palette for plotting the 3 treatment days
trtmt_pal = brewer.pal(4, "Set2")[c(1,3:4)]
# define a function to find the mode of a vector. Credit to https://stackoverflow.com/questions/2547402/how-to-find-the-statistical-mode
Mode <- function(x) {
ux <- unique(x)
ux[which.max(tabulate(match(x, ux)))]
}
Load the fluorescence data from the summary Excel files created by FCS Express. Then load the volume data from the summary .csv files created by the Attune software.
# a function to load the fluorescence counts data (from .xlsx files created by FCS Express)
import_flow_counts <- function(file)
return(as.data.frame(
read_excel(path=file, sheet="FCS Express Report",
# replace the column names as follows:
col_names = c("Filename",
"Gate1", "Count_grimontii",
"Gate2", "Count_putida",
"Gate3", "Count_protegens",
"Gate4","Count_veronii")))
)
# a function to load and parse the volume data
import_flow_volume <- function(file) {
raw.csv <- read.csv(file)
# keep the volume info and just enough data to identify the sample, then drop the resulting redundant rows
vol_data <- raw.csv %>% select(Plate, Sample, Volume) %>% unique()
return(vol_data)
}
# a function to loop through the folders containing the data files, open the .xlsx and .csv files and combine their data
import_from_files <- function(dir_vector){
# initialize variables
raw_counts <- raw_vols <- data.frame()
# loop through each directory
for(dir in dir_vector){
# get all the file names
files_v <- list.files(dir)
# identify the excel files
files_excel <- files_v[endsWith(files_v, ".xlsx")]
# and loop through all of them to extract their data
TMPraw_counts <- data.frame()
for(val in files_excel){
TMPraw_counts <- rbind(TMPraw_counts, import_flow_counts(paste0(dir, "/", val)))
}
# identify the csv files
files_csv <- files_v[endsWith(files_v, ".csv")]
# and loop through all of them to extract their data
TMPraw_vols <- data.frame()
for(val in files_csv){
TMPraw_vols <- rbind(TMPraw_vols, import_flow_volume(paste0(dir, "/", val)))
}
# concatenate the data from counts and from vols
raw_counts <- rbind(raw_counts, TMPraw_counts)
raw_vols <- rbind(raw_vols, TMPraw_vols)
rm(TMPraw_counts, TMPraw_vols)
}
return(list(raw_counts, raw_vols))
}
# get all of the raw data:
list_rawdata <- import_from_files(c("./2July24", "./8July24", "./5Aug24", "./19Aug24"))
Now we can process the data to create unique IDs for each sample. All of this info needs to be parsed from the Filename column for the flow counts data (i.e., the Excel files) and from the Plate column for the flow volumes data (i.e., the csv files).
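To illustrate the parsing, here is the Day > 0 pattern applied to a single made-up filename (the filename itself is invented for this example; the regex is the one used below):

```r
library(tidyr)

# a fabricated filename following the Day > 0 naming pattern described above
demo <- data.frame(Filename = "24-07-08 Day2 -- Epoch.plate1 10x dilution_A1.acs compensated")

# named pieces become columns; unnamed pieces must match but are dropped
parsed <- separate_wider_regex(demo, Filename,
              c(Date = "24-0\\d-\\d{2}", " Day", Day = "\\d", " -- ",
                Incubator = "\\w+", "\\.plate", Plate = "\\d",
                ".*dilution_", Well = "\\w\\d+", "\\.acs compensated"))
parsed
# one row: Date "24-07-08", Day "2", Incubator "Epoch", Plate "1", Well "A1"
```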
# start with flow counts data:
# I got confused and now there are rows containing the column names. Get rid of those...
list_rawdata[[1]] <- list_rawdata[[1]][-grep("Filename", list_rawdata[[1]]$Filename),]
# Day0 has a different pattern in the Filename column so let's process those rows first
Day0 <- list_rawdata[[1]][grep("Day0", list_rawdata[[1]]$Filename),] %>% separate_wider_regex(Filename,
c(Date="24-0\\d-\\d{2}", " Day", Day="\\d",
".*dilution_", Well="\\w\\d+", "\\.acs compensated"))
# Now process the Filename column for the other days
NOTday0 <- list_rawdata[[1]][-grep("Day0", list_rawdata[[1]]$Filename),] %>% separate_wider_regex(Filename,
c(Date="24-0\\d-\\d{2}", " Day", Day="\\d", " -- ",
Incubator="\\w+", "\\.plate", Plate="\\d",
".*dilution_", Well="\\w\\d+", "\\.acs compensated"))
# Put the flow counts data back together into a single data.frame:
raw_Counts <- rbind(Day0 %>% mutate(Incubator=NA, Plate=0), # add in the 2 extra empty columns that are missing from Day0
NOTday0) %>% select(-Gate1, -Gate2, -Gate3, -Gate4)
rm(Day0, NOTday0)
# then do a similar thing for the volume data:
# Day0 has a different pattern in the Plate column so let's process those rows first
Day0 <- list_rawdata[[2]][grep("Day0", list_rawdata[[2]]$Plate),] %>% separate_wider_regex(Plate,
c(Date="24-0\\d-\\d{2}", " Day", Day="\\d", ".*"))
# Now process the Plate column for the other days
NOTday0 <- list_rawdata[[2]][-grep("Day0", list_rawdata[[2]]$Plate),] %>% separate_wider_regex(Plate,
c(Date="24-0\\d-\\d{2}", " Day", Day="\\d", " -- ",
Incubator="\\w+", "\\.plate", Plate="\\d"))
# Put the flow volumes data back together into a single data.frame:
raw_Vol <- rbind(Day0 %>% mutate(Incubator=NA, Plate=0), # add in the 2 extra empty columns that are missing from Day0
NOTday0) %>% rename(Well = Sample) # rename this column for consistency with the Counts data
rm(Day0, NOTday0)
# We can now combine the counts and volume data
# here I need to use left join because we don't have volume data for Day 0 on 24-07-02 !!!!!!
raw_data <- left_join(raw_Counts, raw_Vol,
by=c("Date", "Day", "Well", "Incubator", "Plate"))
rm(raw_Counts, raw_Vol)
# add annotation specifying the Heat treatment and the Incubator
# For 2July24: all samples were subjected to 6h of heat
# For 8July24: samples in the Epoch plate reader are control (no heat)
# Samples in the H1 plate reader are 48h of heat
# For 5Aug24: all samples were subjected to 12h of heat
# For 19Aug24: all samples were subjected to 24h of heat
raw_data$Heat <- 0
raw_data$Heat[which(raw_data$Date == "24-07-02")] <- 6
raw_data$Heat[which(raw_data$Date == "24-07-08" & raw_data$Incubator == "H1")] <- 48
raw_data$Heat[which(raw_data$Date == "24-08-05")] <- 12
raw_data$Heat[which(raw_data$Date == "24-08-19")] <- 24
# change the variable classes for data analysis
raw_data$Count_grimontii <- as.numeric(raw_data$Count_grimontii)
raw_data$Count_putida <- as.numeric(raw_data$Count_putida)
raw_data$Count_protegens <- as.numeric(raw_data$Count_protegens)
raw_data$Count_veronii <- as.numeric(raw_data$Count_veronii)
Finally, we can annotate the data with the sample information for each well. Note that there is a different plate layout for Day 0 (the same for all dates), and that the experiment from 24-07-02 uses a different layout than the rest of the data. I also made some other mistakes, so there are modified layouts for those too! XP
# the "Plate1" layout is used for all days >0 (except for 24-07-02)
layout.plate1 <- data.frame(Well = paste0(LETTERS[1:8], rep((2*1:6)-1, each=8)),
putida = c(0, 1, 1, 1, 0, 0, 0, 0,
1, 0, 0, 0, 1, 1, 1, 0,
0, 0, 1, 1, 1, 0, 1, 0,
0, 0, 1, 1, 1, 0, 1, 0,
1, 1, 1, 0, 1, 0, 0, 0,
0, rep(0,6), 1),
protegens = c(0, 0, 0, 1, 0, 0, 1, 0,
0, 1, 0, 0, 1, 0, 0, 1,
1, 0, 1, 1, 0, 1, 1, 0,
0, 0, 1, 0, 0, 1, 0, 1,
1, 1, 0, 1, 1, 0, 1, 0,
1, rep(0,6), 0),
grimontii = c(0, 0, 1, 0, 0, 1, 0, 0,
0, 0, 1, 0, 0, 1, 0, 1,
0, 1, 1, 0, 1, 1, 1, 0,
1, 0, 0, 1, 0, 1, 0, 0,
1, 0, 1, 1, 1, 0, 0, 1,
1, rep(0,6), 0),
veronii = c(0, 1, 0, 0, 1, 0, 0, 0,
0, 0, 0, 1, 0, 0, 1, 0,
1, 1, 0, 1, 1, 1, 1, 0,
0, 1, 0, 0, 1, 0, 0, 0,
0, 1, 1, 1, 1, 0, 1, 1,
0, rep(0,6), 0))
### CommRich = 0 corresponds to blanks, mistakes made on Day0 are removed altogether,
### and CommRich = NA is used to indicate contamination.
# modified layout of plate1 specific for 24-07-02
layout.plate1_2Jul <- layout.plate1
layout.plate1_2Jul$putida[c(1,8, 41:48)] <- c(0, 1, 1, 1, 1, 0, 1, 0, 0, 0)
layout.plate1_2Jul$protegens[c(1,8, 41:48)] <- c(1, 0, 1, 0, 0, 1, 0, 1, 0, 0)
layout.plate1_2Jul$grimontii[c(1,8, 41:48)] <- c(1, 0, 0, 1, 0, 1, 0, 0, 1, 0)
layout.plate1_2Jul$veronii[c(1,8, 41:48)] <- c(0, 0, 0, 0, 1, 0, 0, 0, 0, 1)
# modified layout of plate1 specific for mistakes made on 24-07-08
# column 4 of OD plate is swapped orientation
layout.plate1_8Jul <- layout.plate1
layout.plate1_8Jul$putida[25:32] <- layout.plate1$putida[9:16]
layout.plate1_8Jul$protegens[25:32] <- layout.plate1$protegens[9:16]
layout.plate1_8Jul$grimontii[25:32] <- layout.plate1$grimontii[9:16]
layout.plate1_8Jul$veronii[25:32] <- layout.plate1$veronii[9:16]
# add a column for community richness in all of the above df's
layout.plate1 <- layout.plate1 %>% mutate(CommRich = putida+protegens+grimontii+veronii , .keep="all")
layout.plate1_2Jul <- layout.plate1_2Jul %>% mutate(CommRich = putida+protegens+grimontii+veronii , .keep="all")
layout.plate1_8Jul <- layout.plate1_8Jul %>% mutate(CommRich = putida+protegens+grimontii+veronii , .keep="all")
# the "Plate2" layout
layout.plate2 <- data.frame(Well = paste0(LETTERS[1:8], rep((2*1:6)-1, each=8)),
putida = c(1, 1, 1, 0, 1, 0, 0, 0,
1, 0, 1, 0, 0, 0, 1, 1,
1, 0, 1, 0, 0, 0, 1, 1,
1, 0, 0, 0, 1, 1, 1, 0,
rep(0,7), 0,
0, 1, 0, 1, 1, 1, 0, 0),
protegens = c(1, 0, 0, 1, 0, 1, 0, 0,
0, 1, 1, 0, 1, 0, 1, 1,
0, 1, 0, 1, 0, 0, 1, 0,
1, 0, 1, 0, 1, 1, 0, 1,
rep(0,7), 1,
0, 1, 1, 0, 1, 1, 0, 0),
grimontii = c(0, 1, 0, 1, 0, 0, 1, 0,
1, 1, 1, 0, 0, 1, 1, 0,
0, 1, 0, 0, 1, 0, 0, 1,
1, 0, 0, 1, 1, 0, 1, 1,
rep(0,7), 0,
0, 1, 1, 1, 0, 1, 1, 0),
veronii = c(0, 0, 1, 0, 0, 0, 0, 1,
1, 1, 1, 0, 1, 1, 0, 1,
1, 0, 0, 0, 0, 1, 0, 0,
1, 0, 1, 1, 0, 1, 1, 1,
rep(0,7), 1,
0, 1, 1, 1, 1, 0, 1, 0))
# modified layout of plate2 specific for 24-07-02
layout.plate2_2Jul <- layout.plate2
layout.plate2_2Jul$putida[1:32] <- layout.plate2$putida[c(9:32,41:47,40)]
layout.plate2_2Jul$protegens[1:32] <- layout.plate2$protegens[c(9:32,41:47,40)]
layout.plate2_2Jul$grimontii[1:32] <- layout.plate2$grimontii[c(9:32,41:47,40)]
layout.plate2_2Jul$veronii[1:32] <- layout.plate2$veronii[c(9:32,41:47,40)]
layout.plate2_2Jul <- layout.plate2_2Jul[1:32,] # rest of flow plate 2 is empty
# modified layout of plate2 specific for mistakes made on 24-07-08 and 24-08-19
layout.plate2_8Jul19Aug <- layout.plate2[-(9:16),] # I screwed up column 8 of OD plate
# add a column for community richness in all of the above df's
layout.plate2 <- layout.plate2 %>% mutate(CommRich = putida+protegens+grimontii+veronii , .keep="all")
layout.plate2_2Jul <- layout.plate2_2Jul %>% mutate(CommRich = putida+protegens+grimontii+veronii , .keep="all")
layout.plate2_8Jul19Aug <- layout.plate2_8Jul19Aug %>% mutate(CommRich = putida+protegens+grimontii+veronii , .keep="all")
# the "Inocula" layout
layout.inocula <- data.frame(Well = paste0(LETTERS[1:8], rep((2*1:6)-1, each=8)),
putida = rep(c(1, 0, 0, 0, 1, 1, 1, 0,
0, 0, 1, 1, 1, 0, 1, NA), times=3),
protegens = rep(c(0, 1, 0, 0, 1, 0, 0, 1,
1, 0, 1, 1, 0, 1, 1, NA), times=3),
grimontii = rep(c(0, 0, 1, 0, 0, 1, 0, 1,
0, 1, 1, 0, 1, 1, 1, NA), times=3),
veronii = rep(c(0, 0, 0, 1, 0, 0, 1, 0,
1, 1, 0, 1, 1, 1, 1, NA), times=3)) %>%
mutate(CommRich = putida+protegens+grimontii+veronii , .keep="all") %>%
filter(!is.na(CommRich))
# a function to annotate each data set with the indicated layout
# this will KEEP well blanks!
annotate_samples <- function(layout, select_date, select_plate) {
relevant_data <- raw_data %>% filter(Date==select_date, Plate==select_plate)
# for Innoc, use inner_join to combine the flow data with its annotation
if(select_plate == 0){
output_df <- inner_join(layout, relevant_data, by="Well")
}
if(select_plate != 0) {
output_df <- left_join(merge(layout, relevant_data %>% select(Day, Incubator, Heat) %>% distinct()),
relevant_data, by=c("Well", "Day", "Incubator", "Heat"))
output_df$Date <- select_date
output_df$Plate <- select_plate
}
#I THINK NONE OF THIS IS NEEDED ANYMORE!!!
# # create annotation for the blank wells
# # grab inoculated species from the layout df
# blank_annot <- layout %>% filter(CommRich==0)
# # grab plate annotation from the data file
# blank_data <- relevant_data %>%
# select(-Well, -Count_putida, -Count_protegens, -Count_grimontii, -Count_veronii, -Volume) %>%
# distinct()
# # outer join these 2 df's
# blank_annot <- merge(blank_annot, blank_data) %>%
# # add missing columns as NA values
# mutate(Count_putida=NA, Count_protegens=NA, Count_grimontii=NA, Count_veronii=NA, Volume=NA)
#
# # combine the annotated data with the annotated blanks
# output_df <- rbind(output_df, blank_annot)
#
# # sometimes I have flow data from real well blanks. i.e., these have real Counts != NA
# # this will produce redundant rows where the Counts == NA
# # due to the rbind above, these redundant rows will always appear lower down in output_df
# output_df <- output_df[!duplicated(output_df %>% select(Well, Day, Heat)),]
return(output_df)
}
# now we can add the sample names for each one.
annotated.rawdata <- rbind(annotate_samples(layout = layout.inocula, select_date = "24-07-02", select_plate=0),
annotate_samples(layout = layout.plate1_2Jul, select_date = "24-07-02", select_plate=1),
annotate_samples(layout = layout.plate2_2Jul, select_date = "24-07-02", select_plate=2),
annotate_samples(layout = layout.inocula, select_date = "24-07-08", select_plate=0),
annotate_samples(layout = layout.plate1_8Jul, select_date = "24-07-08", select_plate=1),
annotate_samples(layout = layout.plate2_8Jul19Aug, select_date = "24-07-08", select_plate=2), ##
annotate_samples(layout = layout.inocula, select_date = "24-08-05", select_plate=0),
annotate_samples(layout = layout.plate1, select_date = "24-08-05", select_plate=1),
annotate_samples(layout = layout.plate2, select_date = "24-08-05", select_plate=2),
annotate_samples(layout = layout.inocula, select_date = "24-08-19", select_plate=0),
annotate_samples(layout = layout.plate1, select_date = "24-08-19", select_plate=1),
annotate_samples(layout = layout.plate2_8Jul19Aug, select_date = "24-08-19", select_plate=2))
# fixing other small mistakes in annotation:
# Day1 of 24-07-02: sample A1 from plate 2 was loaded into sample A1 plate 1.
annotated.rawdata$CommRich[which(annotated.rawdata$Date=="24-07-02" & annotated.rawdata$Day=="1" &
annotated.rawdata$Well=="A1" & annotated.rawdata$Plate=="1")] <- 3
annotated.rawdata$putida[which(annotated.rawdata$Date=="24-07-02" & annotated.rawdata$Day=="1" &
annotated.rawdata$Well=="A1" & annotated.rawdata$Plate=="1")] <- 1
annotated.rawdata$protegens[which(annotated.rawdata$Date=="24-07-02" & annotated.rawdata$Day=="1" &
annotated.rawdata$Well=="A1" & annotated.rawdata$Plate=="1")] <- 0
annotated.rawdata$grimontii[which(annotated.rawdata$Date=="24-07-02" & annotated.rawdata$Day=="1" &
annotated.rawdata$Well=="A1" & annotated.rawdata$Plate=="1")] <- 1
annotated.rawdata$veronii[which(annotated.rawdata$Date=="24-07-02" & annotated.rawdata$Day=="1" &
annotated.rawdata$Well=="A1" & annotated.rawdata$Plate=="1")] <- 1
# Annotate the treatments
annotated.rawdata$Heat_Day <- as.numeric(NA)
annotated.rawdata$Heat_Day[which(annotated.rawdata$Heat!=0 & annotated.rawdata$Day==1)] <- 1
annotated.rawdata$Heat_Day[which(annotated.rawdata$Heat>6 & annotated.rawdata$Day==2)] <- 2
annotated.rawdata$Heat_Day[which(annotated.rawdata$Heat==48 & annotated.rawdata$Day==3)] <- 3
annotated.rawdata$Recov_Day <- as.numeric(NA)
annotated.rawdata$Recov_Day[which(annotated.rawdata$Heat==6 & annotated.rawdata$Day==2)] <- 1
annotated.rawdata$Recov_Day[which(annotated.rawdata$Heat==6 & annotated.rawdata$Day==3)] <- 2
annotated.rawdata$Recov_Day[which(annotated.rawdata$Heat %in% c(12,24) & annotated.rawdata$Day==3)] <- 1
annotated.rawdata$Recov_Day[which(annotated.rawdata$Heat %in% c(12,24) & annotated.rawdata$Day==4)] <- 2
annotated.rawdata$Recov_Day[which(annotated.rawdata$Heat==48 & annotated.rawdata$Day==4)] <- 1
annotated.rawdata$Recov_Day[which(annotated.rawdata$Heat==48 & annotated.rawdata$Day==5)] <- 2
# sanity check to make sure there are no redundant rows
stopifnot(!any(duplicated(annotated.rawdata %>% select(Date, Day, Incubator, Plate, Well))))
# change some of the values to more appropriate types
annotated.rawdata$Plate <- as.numeric(annotated.rawdata$Plate)
annotated.rawdata$Day <- as.numeric(annotated.rawdata$Day)
# clean up
rm(layout.inocula, layout.plate1, layout.plate1_2Jul, layout.plate1_8Jul, layout.plate2, layout.plate2_2Jul, layout.plate2_8Jul19Aug, list_rawdata, raw_data)
The annotated data contains information on the complete dataset,
including blank wells and excluded wells. Any mistakes that were made
during inoculation on Day 0 have been removed altogether.
CommRich == 0 indicates well blanks that should be empty
(in this case, all 4 species columns will also be == 0).
Finally, CommRich == NA indicates data rows that were
excluded, e.g., due to low total counts or contamination (in this case,
the 4 species columns are kept to indicate what should have been in
the excluded well).
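A quick tally of these three categories can be pulled from the annotated data. This is a sanity-check sketch, not part of the pipeline; it assumes annotated.rawdata is in the workspace:

```r
library(dplyr)

# count rows per annotation category described above
annotated.rawdata %>%
  count(category = case_when(is.na(CommRich) ~ "excluded (low counts / contamination)",
                             CommRich == 0   ~ "well blank",
                             TRUE            ~ "community sample"))
```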
For reproducibility, and to check that the metadata is correctly
associated with the data, print the metadata out to file. Note that the
location on the incubated plates (corresponding to the OD data) is different
from the location on the flow cytometry plate. In the code below I
create a column for the OD_Well and assign unique identifiers for each
time series. The metadata file annotation_for_alldata.csv
summarizes all of this info.
# annotation for Day 0 lists the plate as plate 0 but let's change that to Innoc
annotated.rawdata$Plate[which(annotated.rawdata$Plate==0)] <- "Innoc"
# copy the metadata to another variable and remove the data columns
metadata <- annotated.rawdata %>% select(-Volume,
-Count_grimontii, -Count_protegens, -Count_putida, -Count_veronii)
# the columns currently labeled "Well" and "Plate" are only correct for the location of the sample on the flow cytometer plate
metadata$filler <- "plate"
metadata <- metadata %>% unite(col="plateNum", c(filler, Plate), sep="", remove = FALSE) %>%
unite(col="FLOWplateWell", c(plateNum, Well), sep="-", remove = FALSE) %>% select(-filler, -plateNum)
#####
# add true well sample location to metadata (i.e., as corresponding to OD data)
#####
# split up the Well into separate columns for the row and column location
metadata <- metadata %>% separate_wider_regex(Well, c(row="\\w", col="\\d+"))
metadata$REALcol <- 0
# for non-Innoc days after 2 July, the pattern is actually very simple and systematic
metadata$REALcol[which(metadata$Plate==1 & metadata$Date > "24-07-02" & metadata$col==1)] <- 1
metadata$REALcol[which(metadata$Plate==1 & metadata$Date > "24-07-02" & metadata$col==3)] <- 2
metadata$REALcol[which(metadata$Plate==1 & metadata$Date > "24-07-02" & metadata$col==5)] <- 3
metadata$REALcol[which(metadata$Plate==1 & metadata$Date > "24-07-02" & metadata$col==7)] <- 4
metadata$REALcol[which(metadata$Plate==1 & metadata$Date > "24-07-02" & metadata$col==9)] <- 5
metadata$REALcol[which(metadata$Plate==1 & metadata$Date > "24-07-02" & metadata$col==11)] <- 6
metadata$REALcol[which(metadata$Plate==2 & metadata$Date > "24-07-02" & metadata$col==1)] <- 7
metadata$REALcol[which(metadata$Plate==2 & metadata$Date > "24-07-02" & metadata$col==3)] <- 8
metadata$REALcol[which(metadata$Plate==2 & metadata$Date > "24-07-02" & metadata$col==5)] <- 9
metadata$REALcol[which(metadata$Plate==2 & metadata$Date > "24-07-02" & metadata$col==7)] <- 10
metadata$REALcol[which(metadata$Plate==2 & metadata$Date > "24-07-02" & metadata$col==9)] <- 11
metadata$REALcol[which(metadata$Plate==2 & metadata$Date > "24-07-02" & metadata$col==11)] <- 12
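The twelve assignments above follow one arithmetic rule: the flow plates use the odd columns 1, 3, ..., 11, plate 1 maps to OD columns 1-6, and plate 2 maps to OD columns 7-12. A hypothetical helper (not used in this pipeline) makes the mapping explicit:

```r
# map a flow-plate column (odd: 1, 3, ..., 11) and plate number (1 or 2)
# to the corresponding OD-plate column (1-12)
flow_to_od_col <- function(flow_col, plate) (flow_col + 1) / 2 + 6 * (plate - 1)

flow_to_od_col(1, 1)    # OD column 1
flow_to_od_col(11, 1)   # OD column 6
flow_to_od_col(1, 2)    # OD column 7
flow_to_od_col(11, 2)   # OD column 12
```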
# for non-Innoc days on 2 July, the pattern is similar for plate 1 columns 1 to 9:
metadata$REALcol[which(metadata$Plate==1 & metadata$Date == "24-07-02" & metadata$col==1)] <- 1
metadata$REALcol[which(metadata$Plate==1 & metadata$Date == "24-07-02" & metadata$col==3)] <- 2
metadata$REALcol[which(metadata$Plate==1 & metadata$Date == "24-07-02" & metadata$col==5)] <- 3
metadata$REALcol[which(metadata$Plate==1 & metadata$Date == "24-07-02" & metadata$col==7)] <- 4
metadata$REALcol[which(metadata$Plate==1 & metadata$Date == "24-07-02" & metadata$col==9)] <- 5
# the pattern changes from here:
metadata$REALcol[which(metadata$Plate==1 & metadata$Date == "24-07-02" & metadata$col==11)] <- 7
metadata$REALcol[which(metadata$Plate==2 & metadata$Date == "24-07-02" & metadata$col==1)] <- 8
metadata$REALcol[which(metadata$Plate==2 & metadata$Date == "24-07-02" & metadata$col==3)] <- 9
metadata$REALcol[which(metadata$Plate==2 & metadata$Date == "24-07-02" & metadata$col==5)] <- 10
metadata$REALcol[which(metadata$Plate==2 & metadata$Date == "24-07-02" & metadata$col==7)] <- 12
# Note that plate 2 Well H7 on flow actually comes from H11
metadata$REALcol[which(metadata$Plate==2 & metadata$Date == "24-07-02" & metadata$col==7 & metadata$row=="H")] <- 11
# and for plate 1 column 1, Well H1 on flow actually comes from H6
metadata$REALcol[which(metadata$Plate==1 & metadata$Date == "24-07-02" & metadata$col==1 & metadata$row=="H")] <- 6
# finally, plate 1 column1: Well A1 on flow actually comes from A6. But note the mistake on Day1
metadata$REALcol[which(metadata$Plate==1 & metadata$Date == "24-07-02" & metadata$Day!=1 & metadata$col==1 & metadata$row=="A")] <- 6
# on 2 July Day 1, plate 1 well A1 on flow actually comes from A8
metadata$REALcol[which(metadata$Plate==1 & metadata$Date == "24-07-02" & metadata$Day==1 & metadata$col==1 & metadata$row=="A")] <- 8
# now we are finished with the NON-INNOC annotations
# we can put together the row and REALcol columns to get the location on the OD plate
data_meta <- metadata %>% filter(Plate != "Innoc") %>% unite("OD_well", c(row, REALcol), sep="") %>% select(-col) %>%
unite("uniqID", c(Date, Incubator, OD_well), sep=" ", remove = FALSE)
# last (and perhaps least), annotate the additional blank wells from 2 July
july2_blanks <- data_meta %>% filter(Date=="24-07-02", CommRich==0) %>%
select(-FLOWplateWell, -Plate, -uniqID, -OD_well) %>% distinct()
missing_blanks <- data.frame(OD_well=c("A1", "H1", "H12", paste0(LETTERS[1:7],11), paste0(LETTERS[2:7],6)),
FLOWplateWell=NA, Plate=NA) %>%
mutate(uniqID=paste("24-07-02 Epoch", OD_well), .keep="all")
july2_missing <- merge(july2_blanks,missing_blanks)
data_meta <- rbind(data_meta, july2_missing)
rm(july2_blanks, missing_blanks, july2_missing)
#####
# Innoc data: add OD_well and uniqID columns
#####
# In order to annotate the most raw version of the data, I decided to create redundant rows for the Innoc data. This way each row from Innoc appears 5x with its associated OD_well and uniqID.
innoc_meta <- metadata %>% filter(Plate == "Innoc") %>% select(-row, -col, -REALcol, -Incubator, -Heat)
innoc_meta <- suppressWarnings( # we expect left_join to be upset about many-to-many relationship, no need to issue warning.
left_join(innoc_meta,
data_meta %>%
select(-FLOWplateWell, -Day, -Plate, -Heat_Day, -Recov_Day) %>%
distinct(), # remove the redundant rows from each day
by = c("CommRich", "putida", "protegens", "grimontii", "veronii", "Date"))
)
# trash the now old df to avoid confusion
rm(metadata)
# save the complete metadata to file
write.csv(rbind(data_meta, innoc_meta), file="./annotation_for_alldata.csv", quote=FALSE, row.names=FALSE)
#####
# Save the fully annotated raw flow cytometry counts data
#####
# associate the metadata back with the raw counts data
# for Days > 0:
temp_metadata <- data_meta %>% separate_wider_regex(FLOWplateWell, c(FLOW="plate\\w+-", Well="\\w+"))
annot.days <- inner_join(temp_metadata, annotated.rawdata,
by=c("Well", "putida", "protegens", "grimontii", "veronii", "CommRich", "Date",
"Day", "Incubator", "Plate", "Heat", "Heat_Day", "Recov_Day")) %>%
unite("FLOWplateWell", c(FLOW, Well), sep="")
rm(temp_metadata, data_meta)
# for Innoc Days:
temp_metainnoc <- innoc_meta %>% separate_wider_regex(FLOWplateWell, c(FLOW="plate\\w+-", Well="\\w+"))
annot.innoc <- left_join(temp_metainnoc,
annotated.rawdata %>% select(-Incubator, -Heat),
by=c("Well", "putida", "protegens", "grimontii", "veronii", "CommRich", "Date",
"Day", "Plate", "Heat_Day", "Recov_Day")) %>%
unite("FLOWplateWell", c(FLOW, Well), sep="")
# save this to file as well
write.csv(rbind(annot.days, annot.innoc), file="./flow_rawdata.csv", quote=FALSE, row.names=FALSE)
# remove annotated.rawdata as it has been superseded by annot.days and annot.innoc
rm(annotated.rawdata, temp_metainnoc, innoc_meta)
########
# fix annotation mistake for uniqID "24-07-02 Epoch A6"
# this well is missing on Day 1 because A8 was pipetted there instead. But now we have 2 wells for "24-07-02 Epoch A8"...
########
wrong_row <- which(annot.days$uniqID == "24-07-02 Epoch A8" & annot.days$Day == 1 & is.na(annot.days$Volume))
annot.days$uniqID[wrong_row] <- "24-07-02 Epoch A6"
annot.days$OD_well[wrong_row] <- "A6"
annot.days$putida[wrong_row] <- 0
annot.days$protegens[wrong_row] <- 1
annot.days$grimontii[wrong_row] <- 1
annot.days$veronii[wrong_row] <- 0
annot.days$CommRich[wrong_row] <- 2
rm(wrong_row)
The data from Day 0 (annot.innoc) consists of 3 measurements of
the inoculum that was used to inoculate the up to 5 replicate
time series. In the summary annotation file (aka metadata above), each
FLOWplateWell therefore appears redundantly with its up to 5
associated uniqID and OD_well replicates. I chose to do this so that the
raw values from the flow cytometry data are preserved with their
relevant annotation.
Below, this Day 0 data is averaged across the 3 different
FLOWplateWell values, which adds the Day 0 data to the time
series.
# average the Day0 data across its redundant flow cytometry measurements
mean.innoc <- annot.innoc %>% group_by(uniqID, OD_well, Incubator, Plate, Heat, Date,
Day, Heat_Day, Recov_Day,
CommRich, putida, protegens, grimontii, veronii) %>%
summarise(Mean_putida = mean(Count_putida),
Mean_protegens = mean(Count_protegens),
Mean_grimontii = mean(Count_grimontii),
Mean_veronii = mean(Count_veronii),
SD_putida = sd(Count_putida),
SD_protegens = sd(Count_protegens),
SD_grimontii = sd(Count_grimontii),
SD_veronii = sd(Count_veronii),
Vol_mean = mean(Volume),
vol_sd = sd(Volume))
## `summarise()` has grouped output by 'uniqID', 'OD_well', 'Incubator', 'Plate',
## 'Heat', 'Date', 'Day', 'Heat_Day', 'Recov_Day', 'CommRich', 'putida',
## 'protegens', 'grimontii'. You can override using the `.groups` argument.
# here's some plots to summarize how much variation there is between measurements of the same inocula
plotting_mean.innoc <- mean.innoc %>% pivot_longer(cols = Mean_putida:SD_veronii,
names_to = c(".value", "species"),
names_sep = "_") %>%
filter(Incubator != "H1") # the same inoculum was used for 2 treatments on 24-07-08; remove this redundancy for plotting
ggplot(plotting_mean.innoc,
aes(x=Mean, y=SD, colour=species)) +
geom_point(alpha=0.7) +
scale_colour_manual(values=species_pal_only4) +
labs(title="3 measures of inoculum")
ggplot(plotting_mean.innoc,
aes(x=Mean, y=SD, colour=Date)) +
geom_point(alpha=0.7) +
labs(title="3 measures of inoculum")
ggplot(plotting_mean.innoc %>% mutate(CV = SD/Mean),
aes(x=Mean, y=CV, colour=species)) +
geom_point(alpha=0.7) +
scale_colour_manual(values=species_pal_only4) +
labs(title="3 measures of inoculum")
## Warning: Removed 159 rows containing missing values or values outside the scale range
## (`geom_point()`).
# due to false positive counts,
# the CV blows up when I am counting species that are not actually in that sample
ggplot(plotting_mean.innoc %>% filter((putida == 1 & species == "putida") |
(protegens == 1 & species == "protegens") |
(grimontii == 1 & species == "grimontii") |
(veronii == 1 & species == "veronii")) %>%
mutate(CV = SD/Mean),
aes(x=Mean, y=CV, colour=species)) +
geom_point(alpha=0.7) +
scale_colour_manual(values=species_pal_only4) +
labs(title="3 measures of innoculum (remove absent sp)")
ggplot(plotting_mean.innoc %>% unite("community", putida:veronii),
aes(x=community, y=Mean, colour=species)) +
geom_point(alpha=0.7) +
geom_errorbar(aes(ymin=Mean-SD, ymax=Mean+SD), width=.2) +
scale_colour_manual(values=species_pal_only4) +
theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust=1)) +
labs(title="3 measures of innoculum")
ggplot(plotting_mean.innoc %>% filter((putida == 1 & species == "putida") |
(protegens == 1 & species == "protegens") |
(grimontii == 1 & species == "grimontii") |
(veronii == 1 & species == "veronii")) %>%
unite("community", putida:veronii),
aes(x=community, y=Mean, colour=species)) +
geom_point(alpha=0.7) +
geom_errorbar(aes(ymin=Mean-SD, ymax=Mean+SD), width=.2) +
scale_colour_manual(values=species_pal_only4) +
theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust=1)) +
labs(title="3 measures of innoculum (remove absent sp)")
ggplot(plotting_mean.innoc %>% ungroup() %>%
select(-uniqID, -OD_well) %>% distinct() %>%
filter((putida == 1 & species == "putida") |
(protegens == 1 & species == "protegens") |
(grimontii == 1 & species == "grimontii") |
(veronii == 1 & species == "veronii")) %>%
unite("community", putida:veronii) %>% mutate(CV = SD/Mean),
aes(x=community, y=CV, colour=species)) +
geom_jitter(width=0.2, alpha=0.7) +
scale_colour_manual(values=species_pal_only4) +
theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust=1)) +
labs(title="3 measures of innoculum (remove absent sp)")
# I have no idea what I would do with this info about sample volume, but here it is:
# There are no values for 24-07-02 because I forgot to save the .apx file (this was before I realized that volume is not saved in the .acs files)
ggplot(plotting_mean.innoc,
aes(x=Vol_mean, y=vol_sd, colour=Date)) +
geom_point(alpha=0.7) +
labs(title="3 measures of innoculum")
## Warning: Removed 300 rows containing missing values or values outside the scale range
## (`geom_point()`).
# finally, we can add the mean counts for the Innoc to the whole data
annotated.rawdata <- mean.innoc %>% select(-SD_putida, -SD_protegens, -SD_grimontii, -SD_veronii, -vol_sd) %>%
rename(Count_putida=Mean_putida, Count_protegens=Mean_protegens, Count_grimontii=Mean_grimontii, Count_veronii=Mean_veronii, Volume=Vol_mean)
annotated.rawdata <- rbind(annotated.rawdata, annot.days)
# cleanup
rm(annot.days, annot.innoc, plotting_mean.innoc)
# use Day0 inoculum measurements for a first pass at estimating the misclassification rate
# i.e., the rate of falsely classifying events as species A when I know for certain that species A is not present in the sample
misclass.innoc <- mean.innoc %>% mutate(Total_counts = Mean_putida + Mean_protegens + Mean_grimontii + Mean_veronii) %>% # get total for each sample
ungroup() %>% select(-uniqID, -OD_well) %>% distinct() %>% # remove any redundant data
# put each species count in its own row in the column called mean (instead of having a column for each species)
pivot_longer(cols = Mean_putida:SD_veronii,
names_to = c(".value", "species"),
names_sep = "_") %>%
filter(Incubator != "H1") %>% # remove the redundant data
# keep just the instances where we know for sure that this species was NOT present
filter((putida == 0 & species == "putida") |
(protegens == 0 & species == "protegens") |
(grimontii == 0 & species == "grimontii") |
(veronii == 0 & species == "veronii")) %>%
# misclassification rate is the number of false-positive events / total counts
mutate(mean_rate = Mean/Total_counts,
sd_rate = SD/Total_counts)
# re-order the species from fast to slow for better plotting
misclass.innoc$species <- factor(misclass.innoc$species,
levels = c("putida", "protegens", "grimontii", "veronii"))
ggplot(misclass.innoc, aes(x=species, y=mean_rate, colour=species)) +
geom_beeswarm(alpha=0.5) +
geom_errorbar(aes(ymin=mean_rate-sd_rate, ymax=mean_rate+sd_rate), width=.05, alpha=0.2) +
scale_colour_manual(values=species_pal_only4) +
labs(title="misclassification rate in innoculum", y="mean +/- SD")
max(misclass.innoc$mean_rate)
## [1] 0.008537981
ggplot(misclass.innoc %>% unite("community", putida:veronii),
aes(x=species, y=mean_rate, colour=species)) +
facet_wrap(vars(community)) +
geom_point(alpha=0.5) +
scale_y_continuous(breaks = c(0, 0.005, 0.01)) +
scale_colour_manual(values=species_pal_only4) +
theme(axis.text.x = element_text(angle = 90)) +
labs(title="misclassification rate in innoculum")
# summarize the mean and max misclassification rates observed for each species
misclass.innoc %>% group_by(species) %>% summarise(mean_misclass = mean(mean_rate),
max_misclass = max(mean_rate))
# clean-up
rm(misclass.innoc, mean.innoc)
I define the misclassification rate as \(\frac{\text{false positive events}}{\text{total events across all fluorescences}}\). In other words, I count the number of events in the gate(s) where I know there should be zero events, then divide by the total number of fluorescent events in that well.
From here we can clearly see that the misclassification rate can be as bad as 1% and that it depends on the species. Protegens is the most likely to be misclassified and, from the plot of all possible community combinations, the problem seems to be that putida cells are being misclassified as protegens.
But I know that this rate of misclassification also depends on environmental conditions, so I don’t think it makes sense to correct the data using the exact values given above. The more conservative approach is to treat with caution any counts below 1%.
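As a toy illustration of this definition (all counts below are made up, not from the experiment): a well known to contain only putida might still yield a handful of events in the other species' gates, and each absent species' rate is its gate count divided by the well total.

```r
# Hypothetical gate counts for a well inoculated with putida only
counts <- c(putida = 9900, protegens = 85, grimontii = 10, veronii = 5)
total  <- sum(counts)  # 10000 events in this toy example

# Misclassification rate for each species known to be absent
misclass_rate <- counts[c("protegens", "grimontii", "veronii")] / total
round(misclass_rate, 4)
# protegens 0.0085, grimontii 0.0010, veronii 0.0005
```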
QUESTION: do I need a plot of the mean fluorescence value for different species on different days? I think I would like to see this… Which means that I will need to go back to the FCS Express data to extract it. :(
Here we begin to make decisions about which data to keep and which to toss.
I need to set a threshold for the minimum number of fluorescent events observed in a well; below this threshold, I consider the well untrustworthy.
At some point I did sample some wells that are true negatives. From these we learn that a true negative can have as many as 20 total events.
Remember that I set the stopping conditions at 10,000 events in the cell gate OR the end of the sample (which seems to be 146 µL). Somewhat arbitrarily, let’s set the minimum total events per well at 51 and see what happens.
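The reasoning behind the 51-event cutoff can be sketched as follows (the blank counts here are illustrative, not the actual blank-well data):

```r
# Hypothetical total-event counts observed in true-blank wells
blank_counts <- c(0, 2, 5, 11, 20)

# A cutoff of 51 leaves a margin of more than 2x the worst observed blank
threshold <- 51
threshold > 2 * max(blank_counts)
```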
annotated.rawdata <- annotated.rawdata %>% mutate(Total_counts = Count_putida + Count_protegens + Count_grimontii + Count_veronii) %>%
mutate(Total_density = Total_counts/Volume)
# plot the counts and volume for true negative wells
ggplot(annotated.rawdata %>% filter(CommRich==0, !is.na(Total_counts)),
aes(x=Total_counts, y=Volume)) +
geom_point() +
labs(title="True negatives")
# plot the total counts as a histogram just to see what the dispersal is like
ggplot(annotated.rawdata, aes(x=Total_counts)) +
geom_histogram(colour="black", fill="white") +
labs(title="everything")
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## Warning: Removed 387 rows containing non-finite outside the scale range
## (`stat_bin()`).
ggplot(annotated.rawdata, aes(x=Total_counts)) +
geom_histogram(colour="black", fill="white") +
scale_x_log10() +
labs(x="Total_counts in LOG SCALE!", title="everything")
## Warning in scale_x_log10(): log-10 transformation introduced infinite values.
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## Warning: Removed 396 rows containing non-finite outside the scale range
## (`stat_bin()`).
ggplot(annotated.rawdata, aes(x=Total_counts)) +
geom_histogram(colour="black", fill="white") +
scale_x_continuous(limits = c(-10,1010)) +
labs(title="everything")
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## Warning: Removed 2014 rows containing non-finite outside the scale range
## (`stat_bin()`).
## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_bar()`).
# then plot the total counts against the volume because we expect these very low counts should be associated with the highest volumes
ggplot(annotated.rawdata, aes(x=Total_counts, y=Volume, colour=as.factor(Heat_Day))) +
geom_point(alpha=0.5) +
scale_colour_viridis_d(option = "plasma", begin=0.2, end = 0.9) +
labs(colour="Day of heat") +
labs(title="everything")
## Warning: Removed 1652 rows containing missing values or values outside the scale range
## (`geom_point()`).
# okay, let's just see a histogram of the total cell density
ggplot(annotated.rawdata, aes(x=Total_density)) +
geom_histogram(colour="black", fill="white") +
scale_x_log10() +
labs(x="Total_density in LOG SCALE!") +
labs(title="everything")
## Warning in scale_x_log10(): log-10 transformation introduced infinite values.
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## Warning: Removed 471 rows containing non-finite outside the scale range
## (`stat_bin()`).
ggplot(annotated.rawdata %>% filter(!is.na(Heat_Day)),
aes(x=Total_density)) +
facet_grid(rows = vars(Heat_Day)) +
geom_histogram(colour="black", fill="white") +
scale_x_log10() +
labs(x="Total_density in LOG SCALE!", title="Day of Heat (everything)")
## Warning in scale_x_log10(): log-10 transformation introduced infinite values.
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## Warning: Removed 161 rows containing non-finite outside the scale range
## (`stat_bin()`).
ggplot(annotated.rawdata %>% filter(!is.na(Recov_Day)),
aes(x=Total_density)) +
facet_grid(rows = vars(Recov_Day)) +
geom_histogram(colour="black", fill="white") +
scale_x_log10() +
labs(x="Total_density in LOG SCALE!", title="Day of Recovery (everything)")
## Warning in scale_x_log10(): log-10 transformation introduced infinite values.
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## Warning: Removed 133 rows containing non-finite outside the scale range
## (`stat_bin()`).
### check what these graphs look like when I exclude wells where Total_counts < 51
ggplot(annotated.rawdata %>% filter(Total_counts > 50),
aes(x=Total_counts, y=Volume, colour=as.factor(Heat_Day))) +
geom_point(alpha=0.5) +
scale_colour_viridis_d(option = "plasma", begin=0.2, end = 0.9) +
labs(colour="Day of heat") +
labs(title="Total_counts > 50")
## Warning: Removed 1218 rows containing missing values or values outside the scale range
## (`geom_point()`).
ggplot(annotated.rawdata %>% filter(Total_counts > 50),
aes(x=Total_density)) +
geom_histogram(colour="black", fill="white") +
scale_x_log10() +
labs(x="Total_density in LOG SCALE!") +
labs(title="Total_counts > 50")
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## Warning: Removed 75 rows containing non-finite outside the scale range
## (`stat_bin()`).
ggplot(annotated.rawdata %>% filter(Total_counts > 50, !is.na(Heat_Day)),
aes(x=Total_density)) +
facet_grid(rows = vars(Heat_Day)) +
geom_histogram(colour="black", fill="white") +
scale_x_log10() +
labs(x="Total_density in LOG SCALE!", title="Day of Heat (Total_counts > 50)")
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
ggplot(annotated.rawdata %>% filter(Total_counts > 50, !is.na(Recov_Day)),
aes(x=Total_density)) +
facet_grid(rows = vars(Recov_Day)) +
geom_histogram(colour="black", fill="white") +
scale_x_log10() +
labs(x="Total_density in LOG SCALE!", title="Day of Recovery (Total_counts > 50)")
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
#######
# set threshold of > 50 events in total
#######
# copy everything EXCEPT BLANK WELLS to new variable
the.data <- annotated.rawdata %>% filter(CommRich != 0)
# summarize some information about the data points that I'm about to exclude
the.data %>% filter(Total_counts < 51) %>% ungroup() %>% select(uniqID, Heat, Day, Heat_Day, Recov_Day, CommRich, Volume, Total_counts) %>% summary()
## uniqID Heat Day Heat_Day
## Length:87 Min. : 0.00 Min. :2.000 Min. :2.00
## Class :character 1st Qu.:48.00 1st Qu.:3.000 1st Qu.:2.75
## Mode :character Median :48.00 Median :4.000 Median :3.00
## Mean :41.66 Mean :3.621 Mean :2.75
## 3rd Qu.:48.00 3rd Qu.:4.000 3rd Qu.:3.00
## Max. :48.00 Max. :5.000 Max. :3.00
## NA's :47
## Recov_Day CommRich Volume Total_counts
## Min. :1.000 Min. :1.000 Min. :145.0 Min. : 0.00
## 1st Qu.:1.000 1st Qu.:1.000 1st Qu.:145.0 1st Qu.: 2.50
## Median :1.000 Median :2.000 Median :145.0 Median : 7.00
## Mean :1.462 Mean :1.782 Mean :145.4 Mean :10.72
## 3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:146.0 3rd Qu.:15.00
## Max. :2.000 Max. :4.000 Max. :146.0 Max. :50.00
## NA's :48
# exclude from analysis all non-blanks rows where Total_counts < 51
the.data$CommRich[which(the.data$Total_counts < 51)] <- NA
# put the blank data back with the whole dataset
the.data <- rbind(the.data,
annotated.rawdata %>% filter(CommRich == 0))
# replace the data with NA values for all rows where Total_counts < 51
# this includes both the excluded unreliable data and the flow data from the true blanks
the.data$Count_putida[which(the.data$Total_counts < 51)] <- NA
the.data$Count_protegens[which(the.data$Total_counts < 51)] <- NA
the.data$Count_grimontii[which(the.data$Total_counts < 51)] <- NA
the.data$Count_veronii[which(the.data$Total_counts < 51)] <- NA
the.data$Total_density[which(the.data$Total_counts < 51)] <- NA
the.data$Total_counts[which(the.data$Total_counts < 51)] <- NA
I have re-assigned as NA values all wells that had fewer than 51 total fluorescent events. This was a total of 87 wells.
Note that I’ve also removed the flow cytometry data from the true negative wells. This was 17 wells.
# calculate densities and relative abundances for each species
the.data <- the.data %>% mutate(Conc_putida = Count_putida/Volume,
Conc_protegens = Count_protegens/Volume,
Conc_grimontii = Count_grimontii/Volume,
Conc_veronii = Count_veronii/Volume,
relDen_putida = Count_putida/Total_counts,
relDen_protegens = Count_protegens/Total_counts,
relDen_grimontii = Count_grimontii/Total_counts,
relDen_veronii = Count_veronii/Total_counts) #%>%
#select(-Total_counts)
# sanity check that the relative densities always add up to 1
check <- the.data %>% mutate(sum_relDen = relDen_putida + relDen_protegens + relDen_grimontii + relDen_veronii) %>%
# for convenience, remove the 87 NA values
drop_na(Total_counts)
all.equal(check$sum_relDen, rep(1, nrow(check))) %>% # use all.equal() because some values are very close to, but not exactly, 1
stopifnot()
rm(check)
I have calculated the relative densities and made sure that all relative densities add up to 1.
Before diving deeper into the data, let’s just see quickly what the time series look like.
# check: is each replicated time series annotated appropriately so that it can be pieced together?
the.data <- the.data %>% unite("community", putida:veronii, remove=FALSE) %>% ungroup()
for(com in unique(the.data$community)) {
plot( ggplot(the.data %>% filter(community==com) %>%
select(uniqID, Heat, Day, relDen_putida, relDen_protegens, relDen_grimontii, relDen_veronii) %>%
pivot_longer(cols=starts_with("relDen"), names_to="species", names_prefix="relDen_", values_to="relDensity"),
aes(x=Day, y=relDensity, colour=species, group=paste(uniqID,Heat,species))) +
facet_grid(~Heat) +
geom_point(alpha=0.2) +
geom_line(alpha=0.5) +
scale_colour_manual(values=species_pal_only4) +
labs(title=com))
}
## Warning: Removed 16 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Warning: Removed 28 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Warning: Removed 12 rows containing missing values or values outside the scale range
## (`geom_line()`).
## Warning: Removed 8 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Warning: Removed 92 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Warning: Removed 48 rows containing missing values or values outside the scale range
## (`geom_line()`).
## Warning: Removed 16 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Warning: Removed 16 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Warning: Removed 20 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Warning: Removed 40 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Warning: Removed 36 rows containing missing values or values outside the scale range
## (`geom_line()`).
## Warning: Removed 28 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Warning: Removed 12 rows containing missing values or values outside the scale range
## (`geom_line()`).
## Warning: Removed 16 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Warning: Removed 4 rows containing missing values or values outside the scale range
## (`geom_line()`).
## Warning: Removed 40 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Warning: Removed 32 rows containing missing values or values outside the scale range
## (`geom_line()`).
## Warning: Removed 68 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Warning: Removed 64 rows containing missing values or values outside the scale range
## (`geom_line()`).
## Warning: Removed 16 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Warning: Removed 32 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Warning: Removed 12 rows containing missing values or values outside the scale range
## (`geom_line()`).
## Warning: Removed 12 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Warning: Removed 1516 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Warning: Removed 1516 rows containing missing values or values outside the scale range
## (`geom_line()`).
After staring at the above time series for long enough, two things become clear:

1. Protegens has contaminated several wells. Contamination is unambiguous when a species is present in a community where it was not inoculated. For these contaminated replicates, the entire time series will be excluded from the analysis. (This also raises the danger of contamination in wells where that species was already inoculated: e.g., putida in community 1_1_1_0 on day 4 under 12 hrs heat, corresponding to uniqID 24-08-05 Epoch A5, could be an example. There, putida goes to low density except on the final day in just one replicate, and we have absolutely no way to know whether this is a true signal or contamination.)
2. The misclassification rate varies over time: e.g., putida is misclassified in the protegens monoculture (0_1_0_0) on day 1 in 3 different heat treatments, appearing in that well at a relative density of >10%! Likewise, protegens and veronii are misclassified in the putida monoculture (1_0_0_0) on days 2 and 3 of 24 hrs heat.
Of the 87 NA values identified above:

- Some occurred as a result of flow cytometry issues, e.g., there was probably a bubble that I didn’t notice. (Whenever I noticed a bubble, I would re-run that well, but only after I began to understand that this was happening; some wells were unfortunately lost to this error.) IN THIS CASE: this is a true NA value. It only happens at one time point (which may or may not be a heat day), and there is data for this well during the recovery period.
- Some occurred because prolonged heat exposure dropped the total density in that well below the threshold of detection. This only happened on day 3 of heat for the longest heat treatment, and there is data for this well during the recovery period. IN THIS CASE: this is a true NA value.
- Others occurred because prolonged heat exposure drove the well to complete extinction. There is no flow cytometry data for this well during the recovery period because it went extinct. Extinction needs to be confirmed against the OD data. IN THIS CASE: this is a true NA value during the heat treatment, but it should become a 0 value during the recovery period.
The OD data is analyzed in the file called
OD_temp_serial_transfer_expt--28Oct24.Rmd. This script
outputs a csv file indicating the extinct wells, which I will use
below.
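The NA-versus-0 convention can be sketched with a toy table (all values here are hypothetical, not from the dataset):

```r
# Toy illustration: an extinct well stays NA during heat (nothing measurable),
# but OD-confirmed extinction becomes a true 0 during recovery
toy <- data.frame(
  phase   = c("heat", "recovery", "recovery"),
  extinct = c(TRUE, TRUE, FALSE),
  density = c(NA, NA, 1234)
)
toy$density[toy$phase == "recovery" & toy$extinct] <- 0
toy$density
# NA  0  1234
```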
# import extinct well data from file
extinct <- read.csv("./extinctOD_wells.csv")
# we know that there was no detectable growth on Recovery days. So replace the current values with true 0's here.
the.data[which(the.data$uniqID %in% extinct$uniqID & the.data$Recov_Day>0),] <- the.data[which(the.data$uniqID %in% extinct$uniqID & the.data$Recov_Day>0),] %>%
mutate(Total_density=0, Conc_putida=0, Conc_protegens=0, Conc_grimontii=0, Conc_veronii=0,
relDen_putida=0, relDen_protegens=0, relDen_grimontii=0, relDen_veronii=0,
CommRich=putida+protegens+grimontii+veronii)
# during the heat days, we know that there was no OD-detectable growth for (extinct$Day + 1).
# This means any flow cytometry data we have is unreliable and should be replaced with NA.
# wells where Day 2 is unreliable
tmp <- extinct %>% filter(Day == 1)
the.data[which(the.data$uniqID %in% tmp$uniqID & the.data$Day==2),] <- the.data[which(the.data$uniqID %in% tmp$uniqID & the.data$Day==2),] %>%
mutate(Total_density=NA, Conc_putida=NA, Conc_protegens=NA, Conc_grimontii=NA, Conc_veronii=NA,
relDen_putida=NA, relDen_protegens=NA, relDen_grimontii=NA, relDen_veronii=NA,
CommRich=NA)
rm(tmp)
# wells where Day 3 is unreliable (and Day 3 is a heat day!)
extinct <- extinct[-which(extinct$uniqID %in% c("24-08-19 Epoch B4", "24-08-19 Epoch D2")),]
the.data[which(the.data$uniqID %in% extinct$uniqID & the.data$Day==3),] <- the.data[which(the.data$uniqID %in% extinct$uniqID & the.data$Day==3),] %>%
mutate(Total_density=NA, Conc_putida=NA, Conc_protegens=NA, Conc_grimontii=NA, Conc_veronii=NA,
relDen_putida=NA, relDen_protegens=NA, relDen_grimontii=NA, relDen_veronii=NA,
CommRich=NA)
rm(extinct)
To address both the problem of contamination and the problem of the misclassification rate varying over time, I had to closely re-examine the flow cytometry data. (If I had more/longer time series, I might consider applying a smoothing to the data… but that doesn’t make sense here.)
Let’s first exclude all instances where a species that was not inoculated is found at >90% relative density. These are unambiguous contamination events, so I don’t need to waste time scouring through the raw flow cytometry files for them.
Let’s apply the 1% misclassification rate that was found for Day 0:
# identify contamination at 1%
contamin.df <- the.data %>% filter((putida == 0 & relDen_putida > 0.01) |
(protegens == 0 & relDen_protegens > 0.01) |
(grimontii == 0 & relDen_grimontii > 0.01) |
(veronii == 0 & relDen_veronii > 0.01))
contamin.df %>% filter(Date %in% c("24-08-05", "24-08-19")) %>% select(Date, FLOWplateWell, Day, community,
relDen_putida, relDen_protegens, relDen_grimontii, relDen_veronii)
We know from the OD data that no contamination was detected during the 24h heat treatment. (We also know from that same data that on Day 1 there is never any contamination detected in any treatment.) Since the misclassification rate changes with environmental conditions, let’s use the 24h heat treatment data to look at the misclassification rate.
IT SHOULD ALSO BE POSSIBLE TO GET THE COVARIANCE RATE in order to estimate which cells are being misclassified as which.
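A sketch of what that covariance check could look like, with made-up counts: if putida is the source of the false protegens events, the false-positive protegens count should scale with the putida count across wells where protegens was never inoculated.

```r
# Made-up counts across five wells that did not contain protegens
putida_true  <- c(5200, 8100, 3000, 9500, 6400)  # true putida events
protegens_fp <- c(41, 66, 22, 80, 50)            # false-positive protegens events

# A correlation near 1 would point to putida as the likely source
cor(putida_true, protegens_fp)
```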
misclass24 <- the.data %>% filter(Day > 0, Heat == 24) %>%
filter((putida == 0 & relDen_putida > 0) |
(protegens == 0 & relDen_protegens > 0) |
(grimontii == 0 & relDen_grimontii > 0) |
(veronii == 0 & relDen_veronii > 0))
# separate the correctly called species from the species that are absent
misclass24_REAL <- misclass24 %>% mutate(relDen_putida = putida * relDen_putida,
relDen_protegens = protegens * relDen_protegens,
relDen_grimontii = grimontii * relDen_grimontii,
relDen_veronii = veronii * relDen_veronii)
misclass24 <- misclass24 %>% mutate(relDen_putida = abs(putida-1) * relDen_putida,
relDen_protegens = abs(protegens-1) * relDen_protegens,
relDen_grimontii = abs(grimontii-1) * relDen_grimontii,
relDen_veronii = abs(veronii-1) * relDen_veronii)
# pivot longer so there's a column for species
misclass24_REAL <- misclass24_REAL %>% pivot_longer(cols = relDen_putida:relDen_veronii,
values_to = "relDen",
names_to = "species",
names_prefix = "relDen_") %>%
select(uniqID, Day, community, putida, protegens, grimontii, veronii,
Total_density, relDen, species)
misclass24 <- misclass24 %>% pivot_longer(cols = relDen_putida:relDen_veronii,
values_to = "relDen",
names_to = "species",
names_prefix = "relDen_") %>%
select(uniqID, Day, community, putida, protegens, grimontii, veronii,
Total_density, relDen, species)
# remove the true species from the misclass data because these are now fake 0's
misclass24 <- misclass24[-which(misclass24$putida == 1 & misclass24$species == "putida"),]
misclass24 <- misclass24[-which(misclass24$protegens == 1 & misclass24$species == "protegens"),]
misclass24 <- misclass24[-which(misclass24$grimontii == 1 & misclass24$species == "grimontii"),]
misclass24 <- misclass24[-which(misclass24$veronii == 1 & misclass24$species == "veronii"),]
# remove the single contaminated sample
misclass24 <- misclass24[-which(misclass24$protegens == 0 & misclass24$species == "protegens" & misclass24$relDen > 0.75),]
ggplot(misclass24,
aes(x=species, y=relDen, colour=species)) +
facet_wrap(vars(Day)) +
geom_beeswarm(alpha=0.5) +
scale_colour_manual(values=species_pal_only4) +
theme(axis.text.x = element_text(angle = 90)) +
labs(y="relative density of misclassified",
title="misclassification in 24h heat for different days")
Here is a preliminary version of the data with contamination removed. I am producing it so that I can fit a preliminary time-series model. THIS IS NOT THE FINALIZED VERSION OF THE DATA AND WILL NEED TO BE UPDATED!!!!!
rm(contamin.df) # remove anything we may have had above.
# for now let's define contamination as >25% for something that should not be there.
contamin.df <- the.data %>% filter((putida == 0 & relDen_putida > 0.25) |
(protegens == 0 & relDen_protegens > 0.25) |
(grimontii == 0 & relDen_grimontii > 0.25) |
(veronii == 0 & relDen_veronii > 0.25))
tmp <- the.data[-which(the.data$uniqID %in% unique(contamin.df$uniqID)),]
###############
# output absolute density data for analysis
###############
# Day 0 would need to be the pre-dilution absolute densities
ggplot(tmp %>% filter(Day==0) %>% select(-uniqID, -OD_well, -Heat, -Incubator) %>% distinct(),
aes(x=Date, y=Total_density)) +
geom_beeswarm() +
labs(y="")
## Warning: Removed 15 rows containing missing values or values outside the scale range
## (`geom_point()`).
ggplot(tmp %>% filter(Day==0) %>% select(-uniqID, -OD_well, -Heat, -Incubator) %>% distinct(),
aes(x=Date, y=Volume)) +
geom_beeswarm() +
labs(y="")
## Warning: Removed 15 rows containing missing values or values outside the scale range
## (`geom_point()`).
# I lost the data on flow volume for Date 24-07-02.
# But we see from the plot that there's not *that* much variation between batches.
# let's just assume we had the median volume on Day 0 of 24-07-02
tmp.Day0 <- tmp[which(tmp$Day==0 & tmp$Incubator=="Epoch"),] %>% select(-uniqID, -OD_well) %>% distinct()
# get the median volume for Day 0
medianVol <- median(tmp.Day0$Volume, na.rm=TRUE)
# apply the median volume to Day 0 values from 24-07-02
tmp.Day0 <- tmp.Day0 %>% filter(Date=="24-07-02")
tmp.Day0$Volume <- medianVol
# recalculate the absolute densities for Day0
tmp.Day0 <- tmp.Day0 %>% mutate(Total_density = Total_counts/Volume,
Conc_putida = Count_putida/Volume,
Conc_protegens = Count_protegens/Volume,
Conc_grimontii = Count_grimontii/Volume,
Conc_veronii = Count_veronii/Volume)
# finally, join the estimated absolute densities for Day0 back in with the whole data
tmp.Day0.0702 <- left_join(tmp %>% filter(Day==0, Date=="24-07-02") %>% select(-Volume, -Total_density, -Conc_putida, -Conc_protegens, -Conc_grimontii, -Conc_veronii),
tmp.Day0)
## Joining with `by = join_by(Incubator, Plate, Heat, Date, Day, Heat_Day,
## Recov_Day, CommRich, community, putida, protegens, grimontii, veronii,
## Count_putida, Count_protegens, Count_grimontii, Count_veronii, FLOWplateWell,
## Total_counts, relDen_putida, relDen_protegens, relDen_grimontii,
## relDen_veronii)`
tmp <- rbind(tmp %>% filter(Date != "24-07-02"),
tmp %>% filter(Date == "24-07-02") %>% filter(Day > 0),
tmp.Day0.0702)
rm(medianVol, tmp.Day0, tmp.Day0.0702)
for(com in unique(tmp$community)) {
plot(ggplot(tmp %>% filter(community==com) %>%
select(uniqID, Heat, Day, Conc_putida, Conc_protegens, Conc_grimontii, Conc_veronii) %>%
pivot_longer(cols=starts_with("Conc"), names_to="species", names_prefix="Conc_", values_to="absDensity"),
aes(x=Day, y=absDensity, colour=species, group=paste(uniqID,Heat,species))) +
facet_grid(~Heat) +
geom_point(alpha=0.2) +
geom_line(alpha=0.5) +
scale_colour_manual(values=species_pal_only4) +
labs(title=com))
}
## Warning: Removed 16 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Warning: Removed 8 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Warning: Removed 20 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Warning: Removed 16 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Warning: Removed 16 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Warning: Removed 20 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Warning: Removed 20 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Warning: Removed 16 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Warning: Removed 4 rows containing missing values or values outside the scale range
## (`geom_line()`).
## Warning: Removed 12 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Warning: Removed 16 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Warning: Removed 16 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Warning: Removed 12 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Warning: Removed 48 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Warning: Removed 16 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Warning: Removed 36 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Warning: Removed 1516 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Warning: Removed 1516 rows containing missing values or values outside the scale range
## (`geom_line()`).
# finally, remove known miscalled estimates from the data
tmp <- tmp %>% mutate(Conc_putida = putida * Conc_putida,
Conc_protegens = protegens * Conc_protegens,
Conc_grimontii = grimontii * Conc_grimontii,
Conc_veronii = veronii * Conc_veronii) %>%
mutate(Total_density = Conc_putida + Conc_protegens + Conc_grimontii + Conc_veronii)
# output this data to file
absDensity <- tmp %>% select(uniqID, Heat, Day, Heat_Day, Recov_Day, CommRich:veronii, Total_density:Conc_veronii)
save(absDensity, file="./TEMPORARY_absdensity_data.RData")
rm(tmp, com)
This needs to be redone using glmmTMB for easy plotting of the predictions and for trying to fit a more complex model that takes into account the effect of protegens.
# keep just the data on the last day of each time series
extinct.df <- absDensity %>% filter(Recov_Day == 2, CommRich > 0)
extinct.df <- rbind(extinct.df,
absDensity %>% filter(Heat == 0, Day == 5, CommRich > 0))
# binary vector of survival or extinction
extinct.df <- extinct.df %>% mutate(survived = ifelse(Total_density > 0, 1, 0))
### note that sample "24-07-08 Epoch G1" has missing data on Day 5 even though we know from the OD data that it survived.
extinct.df$survived[extinct.df$uniqID == "24-07-08 Epoch G1"] <- 1
# make protegens into a factor
extinct.df$protegens <- factor(extinct.df$protegens)
# fit the models
extinct.model0 <- glmmTMB(survived ~ Heat,
data = extinct.df,
family = binomial,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = extinct.model0, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.9415454 0.9358751 0.2974711 0.1274658 0.05014878 0.98436 0.7555976 0.4884214 0.1946789 0.8020044 0.7146549 0.9364794 0.7889983 0.6482047 0.8523613 0.2259099 0.3698037 0.713902 0.3070799 0.8362558 ...
extinct.model1 <- glmmTMB(survived ~ CommRich + Heat,
data = extinct.df,
family = binomial,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = extinct.model1, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.9405602 0.9332805 0.2780642 0.2218236 0.08724514 0.9881136 0.836583 0.6571041 0.3611436 0.7966241 0.7029741 0.9350192 0.7844607 0.5789393 0.8129909 0.09227307 0.2012628 0.6103148 0.5562197 0.889362 ...
extinct.model2 <- glmmTMB(survived ~ CommRich*Heat,
data = extinct.df,
family = binomial,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = extinct.model2, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.6300432 0.3113643 0.6617919 0.1794894 0.1656411 0.7163017 0.6280772 0.7718714 0.438437 0.4164603 0.6173576 0.5127588 0.7771592 0.4498192 0.9424153 0.08096459 0.3333525 0.7134744 0.837735 0.6455422 ...
extinct.model3 <- glmmTMB(survived ~ CommRich + Heat + protegens,
data = extinct.df,
family = binomial,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = extinct.model3, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.9632196 0.9562617 0.02965619 0.264864 0.1154109 0.9776571 0.8611678 0.3086776 0.4598937 0.7309843 0.8081012 0.9087348 0.7163956 0.5443066 0.7949462 0.3627286 0.08401695 0.5889396 0.65354 0.7787241 ...
extinct.model4 <- glmmTMB(survived ~ CommRich + Heat*protegens,
data = extinct.df,
family = binomial,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = extinct.model4, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.9632196 0.9562617 0.02965619 0.264864 0.1154109 0.9776571 0.8611678 0.3086776 0.4598937 0.7309843 0.8081012 0.9087348 0.7163956 0.5443066 0.7949462 0.3627286 0.08401695 0.5889396 0.65354 0.7787241 ...
anova(extinct.model0, extinct.model1)
anova(extinct.model0, extinct.model2)
anova(extinct.model1, extinct.model2)
AIC(extinct.model0, extinct.model1, extinct.model2, extinct.model3, extinct.model4) %>% arrange(AIC)
BIC(extinct.model0, extinct.model1, extinct.model2, extinct.model3, extinct.model4) %>% arrange(BIC)
summary(extinct.model3)
## Family: binomial ( logit )
## Formula: survived ~ CommRich + Heat + protegens
## Data: extinct.df
##
## AIC BIC logLik deviance df.resid
## 62.1 77.2 -27.0 54.1 320
##
##
## Conditional model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 5.688e+00 1.512e+00 3.763 0.000168 ***
## CommRich 5.080e-01 5.035e-01 1.009 0.312982
## Heat -1.445e-01 3.096e-02 -4.666 3.07e-06 ***
## protegens1 2.305e+01 1.592e+04 0.001 0.998844
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(extinct.model4)
## Family: binomial ( logit )
## Formula: survived ~ CommRich + Heat * protegens
## Data: extinct.df
##
## AIC BIC logLik deviance df.resid
## 64.1 83.0 -27.0 54.1 319
##
##
## Conditional model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 5.688e+00 1.512e+00 3.763 0.000168 ***
## CommRich 5.080e-01 5.035e-01 1.009 0.312985
## Heat -1.445e-01 3.096e-02 -4.666 3.07e-06 ***
## protegens1 1.836e+01 3.096e+04 0.001 0.999527
## Heat:protegens1 1.386e-01 1.189e+03 0.000 0.999907
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# create data.frame for plotting
extinct_predict <- cbind(extinct.model3$frame,
predict(extinct.model3, type="response"))
colnames(extinct_predict)[c(1,4:5)] <- c("observed", "protegens", "predicted")
# plot the predictions against the data
plot(ggplot(extinct_predict,
aes(x=as.factor(Heat),
y=observed,
colour=as.factor(CommRich),
group=as.factor(CommRich))) +
facet_grid(~protegens) +
geom_hline(yintercept = 0, colour="grey") +
geom_jitter(alpha=0.5, size=0.8, width=0.1, height = 0.2) +
geom_line(aes(y = predicted)) +
scale_y_continuous(breaks = c(0, 1)) +
scale_colour_viridis_d(option = "viridis", end=0.85) +
labs(title="(CommRich as numeric)", x="Heat duration (hrs)",
y="Growth in well on last day?", colour="Inoculated\nRichness"))
# plot the effect sizes of the preferred model
extinct_forplot <- data.frame(confint(extinct.model3))
colnames(extinct_forplot)[1:2] <- c("loCI", "hiCI")
extinct_forplot$predictor <- as.factor(rownames(extinct_forplot))
ggplot(extinct_forplot,
aes(x = Estimate, y = predictor)) +
geom_vline(xintercept = 0, colour="grey") +
geom_point()
# the estimates for this model are rather puzzling
ggplot(extinct_forplot,
aes(x = Estimate, y = predictor)) +
geom_vline(xintercept = 0, colour="grey") +
geom_point() +
geom_errorbarh(aes(xmin = loCI, xmax = hiCI), height=0)
# Note that community richness has a non-linear effect on survival
# when we run the model again considering inoculated richness as unordered factors
extinct.df$CommRich <- factor(extinct.df$CommRich, ordered = FALSE)
extinct.model3_categ <- glmmTMB(survived ~ CommRich + Heat + protegens,
data = extinct.df,
family = binomial,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = extinct.model3_categ, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.9520541 0.9466244 0.02965619 0.2963166 0.1229676 0.9776571 0.8915374 0.3086776 0.5276082 0.7309843 0.7530346 0.9087348 0.7163956 0.5443066 0.7949462 0.5313656 0.08401695 0.5889396 0.7469674 0.7787241 ...
extinct.model4_categ <- glmmTMB(survived ~ CommRich + Heat*protegens,
data = extinct.df,
family = binomial,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = extinct.model4_categ, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.745615 0.5517985 0.6038137 0.2683341 0.5621223 0.01228703 0.895571 0.5916677 0.06399853 0.8125447 0.7521971 0.5345887 0.5310096 0.165366 0.9046134 0.4078037 0.6056507 0.2864387 0.8671172 0.06874163 ...
# plot the effect sizes of the categorical model
extinct_forplot <- data.frame(confint(extinct.model3_categ))
colnames(extinct_forplot)[1:2] <- c("loCI", "hiCI")
extinct_forplot$predictor <- as.factor(rownames(extinct_forplot))
ggplot(extinct_forplot,
aes(x = Estimate, y = predictor)) +
geom_vline(xintercept = 0, colour="grey") +
geom_point()
# it seems as though both CommRich = 4 and the presence of protegens prevent extinction. But this is more likely explained by the fact that CommRich is correlated with the presence of protegens: every 4-species community necessarily contains protegens...
# but this model is not preferred over the simpler one where CommRich is numeric
anova(extinct.model3, extinct.model3_categ)
AIC(extinct.model3, extinct.model3_categ, extinct.model4_categ) %>% arrange(AIC)
BIC(extinct.model3, extinct.model3_categ, extinct.model4_categ) %>% arrange(BIC)
# create data.frame for plotting
extinct_predict <- cbind(extinct.model3_categ$frame,
predict(extinct.model3_categ, type="response"))
colnames(extinct_predict)[c(1,4:5)] <- c("observed", "protegens", "predicted")
# plot the predictions against the data
plot(ggplot(extinct_predict,
aes(x=as.factor(Heat),
y=observed,
colour=as.factor(CommRich),
group=as.factor(CommRich))) +
facet_grid(~protegens) +
geom_hline(yintercept = 0, colour="grey") +
geom_jitter(alpha=0.5, size=0.8, width=0.1, height = 0.2) +
geom_line(aes(y = predicted)) +
scale_y_continuous(breaks = c(0, 1)) +
scale_colour_viridis_d(option = "viridis", end=0.85) +
labs(title="(CommRich as categorical)", x="Heat duration (hrs)",
y="Growth in well on last day?", colour="Inoculated\nRichness"))
# Note that it's important to take survival into account when looking at Shannon diversity:
extinct.df$Diversity <- diversity(extinct.df[,c("Conc_putida", "Conc_protegens", "Conc_grimontii", "Conc_veronii")])
ggplot(extinct.df %>% filter(survived == 1, CommRich != 1),
aes(x=as.factor(Heat), y=Diversity, colour=as.factor(CommRich))) +
facet_grid(~ protegens) +
geom_jitter(alpha=0.4, width=0.3) +
scale_colour_viridis_d(option = "viridis", begin=0.1, end=0.85) +
labs(title="Diversity on last day (protegens present?)", x="Heat duration (hrs)",
y="Shannon diversity on last day", colour="Inoculated\nRichness")
# clean up
rm(extinct.model0, extinct.model1, extinct.model2, extinct.model3_categ, extinct.model3, extinct.model4, extinct_predict, extinct_forplot)
I was surprised by the 48h data for P. putida. It is the fastest grower, the most abundant species in communities without P. protegens, and rather insensitive to \(40^\circ C\). Yet it appears more likely to go extinct in co-culture with the two slow-growing species than when grown alone. Let's check whether this effect is statistically significant.
# keep just the communities containing putida but not protegens, and only at 48h heat
extinct_putida.df <- extinct.df %>% filter(putida == 1, protegens == 0, Heat == 48)
# remember that inoculated community richness (CommRich) is a factor now
# run the binomial glm for just the putida data at 48h and an effect of inoculated diversity
extinctPutida.model_categ <- glmmTMB(survived ~ CommRich,
data = extinct_putida.df,
family = binomial,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = extinctPutida.model_categ, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.2697141 0.5387343 0.5906688 0.6997162 0.05898375 0.4893786 0.005021494 0.06557116 0.5608866 0.7940033 0.8580841 0.2444507 0.02787849 0.7405232 0.4996581 0.3666084 0.09296374 0.8083572
# convert inoculated community richness back into a numeric vector via as.character()
extinct_putida.df$CommRich <- as.numeric(as.character(extinct_putida.df$CommRich))
# run the model again now with CommRich as numeric
extinctPutida.model_numeric <- glmmTMB(survived ~ CommRich,
data = extinct_putida.df,
family = binomial,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = extinctPutida.model_numeric, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.3699494 0.5889239 0.540591 0.7462602 0.05141364 0.4341051 0.007674358 0.09835673 0.4888174 0.772265 0.876467 0.3313388 0.01151503 0.7724375 0.4437947 0.448231 0.07960983 0.7107936
# check which one is preferred
anova(extinctPutida.model_categ, extinctPutida.model_numeric)
AIC(extinctPutida.model_categ, extinctPutida.model_numeric) %>% arrange(AIC)
BIC(extinctPutida.model_categ, extinctPutida.model_numeric) %>% arrange(BIC)
## okay so they are statistically indistinguishable according to anova and the numeric is slightly preferred by BIC (fewer df)
# check for significant effect of diversity
summary(extinctPutida.model_numeric)
## Family: binomial ( logit )
## Formula: survived ~ CommRich
## Data: extinct_putida.df
##
## AIC BIC logLik deviance df.resid
## 24.0 25.8 -10.0 20.0 16
##
##
## Conditional model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 3.3994 1.9004 1.789 0.0737 .
## CommRich -1.3240 0.8563 -1.546 0.1221
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# plot the effect sizes of the preferred model
extinct_forplot <- data.frame(confint(extinctPutida.model_numeric))
colnames(extinct_forplot)[1:2] <- c("loCI", "hiCI")
extinct_forplot$predictor <- as.factor(rownames(extinct_forplot))
ggplot(extinct_forplot,
aes(x = Estimate, y = predictor)) +
geom_vline(xintercept = 0, colour="grey") +
geom_point() +
geom_errorbarh(aes(xmin = loCI, xmax = hiCI), height=0)
# create data.frame for plotting
extinct_predict <- cbind(extinctPutida.model_numeric$frame,
predict(extinctPutida.model_numeric, type="response"))
colnames(extinct_predict)[c(1,3)] <- c("observed", "predicted")
# plot the predictions against the data
plot(ggplot(extinct_predict,
aes(x=CommRich,
y=observed)) +
geom_hline(yintercept = 0, colour="grey") +
geom_jitter(alpha=0.5, size=2, width=0.15, height = 0.05) +
geom_line(aes(y = predicted)) +
scale_y_continuous(breaks = c(0, 1)) +
scale_x_continuous(breaks = 1:3) +
labs(title="effect of community on putida survival", y="Growth in well on last day?", x="inoculated richness"))
# clean up
rm(extinct_putida.df, extinctPutida.model_categ, extinctPutida.model_numeric, extinct_forplot, extinct_predict)
#############################################
# for the code below to run properly,
#############################################
# Heat needs to be a factor
extinct.df$Heat <- factor(extinct.df$Heat,
levels = c("0", "6", "12", "24", "48"))
levels(extinct.df$Heat)[1] <- "control"
# and we need to empirically estimate the probability of extinction for each community across all heat treatments
community_extinction_probs <- extinct.df %>%
  group_by(community, CommRich) %>%
  summarise(surviv_prob = sum(survived)/n()) %>%
  arrange(CommRich)
## `summarise()` has grouped output by 'community'. You can override using the
## `.groups` argument.
# a vector of the risk-prone communities
extict_prone_comms <- community_extinction_probs$community[which(community_extinction_probs$surviv_prob < 1)]
# a vector of the communities that never went extinct
never_extinct_comms <- community_extinction_probs$community[which(community_extinction_probs$surviv_prob == 1)]
rm(community_extinction_probs)
We will analyze the diversity and the productivity (i.e., total density) to understand how they change relative to the no-heat control during the resistance and recovery periods. Diversity is the easier of the two because we can use the Shannon diversity calculation as implemented by vegan.
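To make the metric concrete (a minimal sketch; the well compositions below are made up for illustration), vegan's `diversity()` defaults to Shannon entropy with natural logs:

```r
library(vegan)

# Shannon H = -sum(p * log(p)); maximal when all species are equally abundant
even_well   <- c(putida = 100, protegens = 100, grimontii = 100, veronii = 100)
uneven_well <- c(putida = 397, protegens = 1, grimontii = 1, veronii = 1)

diversity(even_well)    # log(4) ~= 1.39, the maximum for 4 species
diversity(uneven_well)  # ~0.05, a near-monoculture
```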
For the total abundances, the data need to be log transformed. Unfortunately, the data contain extinction and NA events. To deal with this, I apply the \(x + \epsilon\) transformation to both the numerator and denominator before calculating the effect size as \(\frac{\textrm{total density}_{heat}}{\textrm{total density}_{control}}\). As in timeseries--4Nov24.Rmd, we define \(\epsilon\) as 0.25 times the flow cytometer's threshold of detection, 50 total fluorescent events in \(146\mu L\).
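As a minimal sketch of that transformation (the constants match the chunks below; the two density values are hypothetical):

```r
# epsilon = 0.25 * detection threshold (50 fluorescent events in 146 uL)
epsilon <- 0.25 * 50 / 146

heat_total <- 0    # a well that went extinct during heat
ctrl_total <- 120  # its paired no-heat control

# adding epsilon to numerator and denominator keeps the ratio finite
# and log-transformable even when the heated well is at zero density
effect_size <- (heat_total + epsilon) / (ctrl_total + epsilon)
log(effect_size)
```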
The simplest hypothesis is that heat duration, inoculated richness, and their interaction explain diversity during the resistance and recovery periods.
A slightly more complex hypothesis, motivated by the thermal performance curve data (Fig. 1), is that any species able to withstand heat should have higher productivity during heat, even long-duration heat. I therefore created another model that replaces the putida presence/absence predictor with withstands_heat (the information we actually know from Fig. 1), keeping heat duration, inoculated richness, and their interaction as predictors.
Finally, a hypothesis that emerges from the time series data itself is that protegens is a strong competitor. This model has the same structure as the "withstands heat" model above, but uses the presence of protegens instead.
absDensity$Diversity <- diversity(absDensity[,c("Conc_putida", "Conc_protegens", "Conc_grimontii", "Conc_veronii")])
# first let's remove the empty wells as we won't need them anymore
absDensity <- absDensity %>% filter(community != "0_0_0_0")
# Note that there are many 0 and NA values for Total_density
summary(absDensity$Total_density)
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 0.00 47.03 100.25 219.98 220.90 2674.00 72
# 0's are communities that went extinct altogether and never recovered
absDensity[which(absDensity$Total_density == 0),]
# most NA's are communities below the threshold of detection during heat that later perhaps recovered
absDensity[which(is.na(absDensity$Total_density) & absDensity$Heat>12),]
# other NA's are just missing data (e.g., due to flow cytometry clogs or just plain pipetting mistakes)
absDensity[which(is.na(absDensity$Total_density) & absDensity$Heat<12),]
# the total density data will have to be slightly adjusted for fitting to the models
absDen_forFit <- absDensity %>% filter(Day > 0)
# for the "raw" total density data that will be fitted via negative binomial GLM,
# keep the 0's in the data
# but convert NA's into epsilon values (where epsilon is just below the threshold of detection)
below_threshold_rows <- which(is.na(absDen_forFit$Total_density) & absDen_forFit$Heat>12)
absDen_forFit$Total_density[below_threshold_rows] <- (0.25*50/146)
# for the transformed total density data, apply (x + epsilon) transformation to all values EXCEPT those that used to be NA's
absDen_forFit$TotDen_plusEpsilon <- absDen_forFit$Total_density
absDen_forFit$TotDen_plusEpsilon[-below_threshold_rows] <- absDen_forFit$TotDen_plusEpsilon[-below_threshold_rows] + (0.25*50/146)
rm(below_threshold_rows)
plot(ggplot(absDensity %>% filter(Day!=0,
!is.na(CommRich)),
aes(x=Day,
y=Diversity,
colour=as.factor(CommRich),
group=paste(uniqID,Heat))) +
facet_grid(~Heat) +
geom_point(alpha=0.2) +
geom_line(alpha=0.9) +
scale_colour_viridis_d(option = "viridis", end=0.85) +
labs(title="All data", y="Shannon Diversity",
colour="Inoculated\nRichness"))
## Warning: Removed 16 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_line()`).
# this is not very useful to look at. Let's convert to effect size
control_trtmt <- absDen_forFit %>% filter(Heat==0) %>%
select(-uniqID, -Heat, -putida, -protegens, -grimontii, -veronii) %>%
rename(ctrl_Total_density = Total_density,
ctrl_TotDen_plusEpsilon = TotDen_plusEpsilon,
ctrl_Conc_putida = Conc_putida,
ctrl_Conc_protegens = Conc_protegens,
ctrl_Conc_grimontii = Conc_grimontii,
ctrl_Conc_veronii = Conc_veronii,
ctrl_Diversity = Diversity)
effectSize <- left_join(absDen_forFit %>% filter(Heat!=0),
control_trtmt,
by=c("Day", "community")) %>%
mutate(Total_density_frac = Total_density / ctrl_Total_density,
TotDen_plusEpsilon_frac = TotDen_plusEpsilon / ctrl_TotDen_plusEpsilon,
Conc_putida_frac = Conc_putida / ctrl_Conc_putida,
Conc_protegens_frac = Conc_protegens / ctrl_Conc_protegens,
Conc_grimontii_frac = Conc_grimontii / ctrl_Conc_grimontii,
Conc_veronii_frac = Conc_veronii / ctrl_Conc_veronii,
Diversity_diff = Diversity - ctrl_Diversity) %>%
group_by(Day, Heat, community, uniqID) %>%
summarise(Total_density_mean = mean(Total_density_frac, na.rm = TRUE),
Total_density_sd = sd(Total_density_frac, na.rm = TRUE),
TotDen_plusEpsilon_mean = mean(TotDen_plusEpsilon_frac, na.rm = TRUE),
TotDen_plusEpsilon_sd = sd(TotDen_plusEpsilon_frac, na.rm = TRUE),
Conc_putida_mean = mean(Conc_putida_frac, na.rm = TRUE),
Conc_putida_sd = sd(Conc_putida_frac, na.rm = TRUE),
Conc_protegens_mean = mean(Conc_protegens_frac, na.rm = TRUE),
Conc_protegens_sd = sd(Conc_protegens_frac, na.rm = TRUE),
Conc_grimontii_mean = mean(Conc_grimontii_frac, na.rm = TRUE),
Conc_grimontii_sd = sd(Conc_grimontii_frac, na.rm = TRUE),
Conc_veronii_mean = mean(Conc_veronii_frac, na.rm = TRUE),
Conc_veronii_sd = sd(Conc_veronii_frac, na.rm = TRUE),
Diversity_mean = mean(Diversity_diff, na.rm = TRUE),
Diversity_sd = sd(Diversity_diff, na.rm = TRUE))
## Warning in left_join(absDen_forFit %>% filter(Heat != 0), control_trtmt, : Detected an unexpected many-to-many relationship between `x` and `y`.
## ℹ Row 1 of `x` matches multiple rows in `y`.
## ℹ Row 3 of `y` matches multiple rows in `x`.
## ℹ If a many-to-many relationship is expected, set `relationship =
## "many-to-many"` to silence this warning.
## `summarise()` has grouped output by 'Day', 'Heat', 'community'. You can
## override using the `.groups` argument.
rm(control_trtmt)
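The many-to-many warning above is expected: each heated replicate is compared against every control replicate of the same community on the same day. With dplyr >= 1.1.0 the warning can be silenced by declaring the relationship explicitly (a sketch; only the `left_join()` call changes, the rest of the pipe stays as above):

```r
effectSize <- left_join(absDen_forFit %>% filter(Heat != 0),
                        control_trtmt,
                        by = c("Day", "community"),
                        relationship = "many-to-many")
```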
# add the annotation back to the data
effectSize <- full_join(effectSize,
annotated.rawdata %>%
unite("community", putida:veronii, remove = FALSE) %>%
ungroup() %>% select(Heat, Day:veronii) %>%
filter(Day!=0) %>% distinct(),
by=c("community", "Heat", "Day")) %>%
filter(community != "0_0_0_0", Heat > 0)
# adjust the variable types for plotting and analysis
effectSize$Heat <- factor(effectSize$Heat, ordered = TRUE)
effectSize$putida <- as.logical(effectSize$putida)
effectSize$protegens <- as.logical(effectSize$protegens)
effectSize$grimontii <- as.logical(effectSize$grimontii)
effectSize$veronii <- as.logical(effectSize$veronii)
# create a column indicating the last heat day for each heat treatment
effectSize$Last_Heat_Day <- FALSE
effectSize$Last_Heat_Day[which(effectSize$Heat==6 & effectSize$Day==1)] <- TRUE
effectSize$Last_Heat_Day[which(effectSize$Heat==12 & effectSize$Day==2)] <- TRUE
effectSize$Last_Heat_Day[which(effectSize$Heat==24 & effectSize$Day==2)] <- TRUE
effectSize$Last_Heat_Day[which(effectSize$Heat==48 & effectSize$Day==3)] <- TRUE
# consider heat as an unordered factor
effectSize$Heat <- factor(effectSize$Heat, ordered = FALSE)
# create a column indicating presence of species that don't die during heat
effectSize$withstands_heat <- FALSE
effectSize$withstands_heat[which(effectSize$putida == TRUE)] <- TRUE
## now we can plot effect size of diversity
plot(ggplot(effectSize,
aes(x=Day,
y=Diversity_mean,
colour=as.factor(CommRich),
group=community)) +
geom_hline(yintercept = 0) +
facet_grid(~Heat) +
geom_jitter(alpha=0.2, size=0.8, width=0.05) +
stat_summary(fun=mean, geom="line", alpha=0.9) +
scale_colour_viridis_d(option = "viridis", end=0.85) +
labs(title="All data", colour="Inoculated\nRichness",
y="Effect size on Shannon Diversity"))
## Warning: Removed 64 rows containing non-finite outside the scale range
## (`stat_summary()`).
## Warning: Removed 64 rows containing missing values or values outside the scale range
## (`geom_point()`).
plot(ggplot(effectSize %>%
filter(protegens==1),
aes(x=Day,
y=Diversity_mean,
colour=as.factor(CommRich),
group=community)) +
geom_hline(yintercept = 0) +
facet_grid(~Heat) +
geom_jitter(alpha=0.2, size=0.8, width=0.05) +
stat_summary(fun=mean, geom="line", alpha=0.9) +
scale_colour_viridis_d(option = "viridis", end=0.85) +
labs(title="Including protegens", colour="Inoculated\nRichness",
y="Effect size on Shannon Diversity"))
## Warning: Removed 24 rows containing non-finite outside the scale range
## (`stat_summary()`).
## Warning: Removed 24 rows containing missing values or values outside the scale range
## (`geom_point()`).
plot(ggplot(effectSize %>%
filter(protegens==0),
aes(x=Day,
y=Diversity_mean,
colour=as.factor(CommRich),
group=community)) +
geom_hline(yintercept = 0) +
facet_grid(~Heat) +
geom_jitter(alpha=0.2, size=0.8, width=0.05) +
stat_summary(fun=mean, geom="line", alpha=0.9) +
scale_colour_viridis_d(option = "viridis", end=0.85) +
labs(title="Excluding protegens", colour="Inoculated\nRichness",
y="Effect size on Shannon Diversity"))
## Warning: Removed 40 rows containing non-finite outside the scale range
## (`stat_summary()`).
## Warning: Removed 40 rows containing missing values or values outside the scale range
## (`geom_point()`).
print("Which communities correspond to the outlier with increased diversity in the absence of protegens?")
## [1] "Which communities correspond to the outlier with increased diversity in the absence of protegens?"
effectSize[which(effectSize$CommRich > 1 & effectSize$protegens==0 & effectSize$Diversity_mean > 0),] %>% select(Day, Heat, community) %>% distinct %>% arrange(Heat, Day)
# RESISTANCE: let's check for statistically significant effect
effectSize$CommRich <- factor(effectSize$CommRich)
resist_divers0 <- with(effectSize %>% filter(Last_Heat_Day == TRUE),
lm(Diversity_mean ~ CommRich + Heat + CommRich:Heat))
# Maddy's preferred model
print("RESISTANCE DIVERSITY DATA. SUMMARY OF THE SIMPLEST MODEL:")
## [1] "RESISTANCE DIVERSITY DATA. SUMMARY OF THE SIMPLEST MODEL:"
summary(resist_divers0)
##
## Call:
## lm(formula = Diversity_mean ~ CommRich + Heat + CommRich:Heat)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.5832 0.0000 0.0046 0.1269 0.4072
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.232e-16 4.460e-02 0.000 1.00000
## CommRich2 -1.861e-01 5.839e-02 -3.187 0.00168 **
## CommRich3 -1.875e-01 6.307e-02 -2.972 0.00334 **
## CommRich4 2.449e-03 9.973e-02 0.025 0.98043
## Heat12 -9.235e-17 7.283e-02 0.000 1.00000
## Heat24 -4.257e-19 7.487e-02 0.000 1.00000
## Heat48 1.013e-18 1.235e-01 0.000 1.00000
## CommRich2:Heat12 9.954e-02 9.054e-02 1.099 0.27299
## CommRich3:Heat12 5.063e-02 9.689e-02 0.523 0.60190
## CommRich4:Heat12 -1.093e-02 1.457e-01 -0.075 0.94024
## CommRich2:Heat24 5.926e-02 9.191e-02 0.645 0.51985
## CommRich3:Heat24 1.318e-01 1.012e-01 1.302 0.19461
## CommRich4:Heat24 1.473e-02 1.533e-01 0.096 0.92356
## CommRich2:Heat48 1.816e-01 1.526e-01 1.190 0.23556
## CommRich3:Heat48 7.373e-02 1.649e-01 0.447 0.65525
## CommRich4:Heat48 -1.030e-02 2.510e-01 -0.041 0.96729
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1995 on 192 degrees of freedom
## (59 observations deleted due to missingness)
## Multiple R-squared: 0.1253, Adjusted R-squared: 0.05691
## F-statistic: 1.833 on 15 and 192 DF, p-value: 0.03265
resist_divers1 <- with(effectSize %>% filter(Last_Heat_Day == TRUE),
lm(Diversity_mean ~ CommRich + withstands_heat + Heat + CommRich:withstands_heat + Heat:withstands_heat + CommRich:Heat))
resist_divers2 <- with(effectSize %>% filter(Last_Heat_Day == TRUE),
lm(Diversity_mean ~ CommRich + protegens + Heat + CommRich:protegens + Heat:protegens + CommRich:Heat))
# let's compare the simpler model with the more complex one
anova(resist_divers0, resist_divers1)
anova(resist_divers0, resist_divers2)
AIC(resist_divers0, resist_divers1, resist_divers2) %>% arrange(AIC)
BIC(resist_divers0, resist_divers1, resist_divers2) %>% arrange(BIC)
# the statistically preferred model
print("")
## [1] ""
print("RESISTANCE DIVERSITY DATA. SUMMARY OF THE LOWEST AIC & BIC MODEL:")
## [1] "RESISTANCE DIVERSITY DATA. SUMMARY OF THE LOWEST AIC & BIC MODEL:"
print("")
## [1] ""
summary(resist_divers2)
##
## Call:
## lm(formula = Diversity_mean ~ CommRich + protegens + Heat + CommRich:protegens +
## Heat:protegens + CommRich:Heat)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.29246 -0.03143 0.00102 0.02250 0.54282
##
## Coefficients: (1 not defined because of singularities)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.02250 0.02877 -0.782 0.43506
## CommRich2 -0.34708 0.03893 -8.916 4.46e-16 ***
## CommRich3 -0.66022 0.04932 -13.385 < 2e-16 ***
## CommRich4 -0.06506 0.06810 -0.955 0.34063
## protegensTRUE 0.09002 0.04472 2.013 0.04555 *
## Heat12 0.03613 0.04648 0.777 0.43795
## Heat24 0.05393 0.04832 1.116 0.26579
## Heat48 0.07611 0.10012 0.760 0.44810
## CommRich2:protegensTRUE 0.27691 0.04616 5.999 1.01e-08 ***
## CommRich3:protegensTRUE 0.57034 0.05288 10.785 < 2e-16 ***
## CommRich4:protegensTRUE NA NA NA NA
## protegensTRUE:Heat12 -0.12273 0.04555 -2.694 0.00770 **
## protegensTRUE:Heat24 -0.15916 0.04628 -3.439 0.00072 ***
## protegensTRUE:Heat48 -0.17042 0.10596 -1.608 0.10945
## CommRich2:Heat12 0.11121 0.05488 2.026 0.04417 *
## CommRich3:Heat12 0.08532 0.06118 1.395 0.16480
## CommRich4:Heat12 0.07566 0.09185 0.824 0.41118
## CommRich2:Heat24 0.08491 0.05558 1.528 0.12831
## CommRich3:Heat24 0.17213 0.06365 2.704 0.00748 **
## CommRich4:Heat24 0.11996 0.09608 1.249 0.21340
## CommRich2:Heat48 0.09248 0.10028 0.922 0.35762
## CommRich3:Heat48 0.12544 0.10058 1.247 0.21390
## CommRich4:Heat48 0.08401 0.15555 0.540 0.58978
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1185 on 186 degrees of freedom
## (59 observations deleted due to missingness)
## Multiple R-squared: 0.7006, Adjusted R-squared: 0.6668
## F-statistic: 20.73 on 21 and 186 DF, p-value: < 2.2e-16
plot(resist_divers2)
## Warning: not plotting observations with leverage one:
## 208
# plot Maddy's preferred model
plot(ggplot(effectSize %>%
filter(Last_Heat_Day == TRUE),
aes(x=Heat,
y=Diversity_mean,
colour=CommRich,
group=CommRich)) +
geom_hline(yintercept = 0, colour="grey") +
geom_jitter(alpha=0.4, size=0.8, width=0.05) +
geom_line(stat="smooth", method=lm, alpha=0.9) +
scale_colour_viridis_d(option = "viridis", end=0.85) +
labs(title="Resistance", x="Heat duration (hrs)",
y="Effect size on Shannon Diversity", colour="Inoculated\nRichness"))
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 59 rows containing non-finite outside the scale range
## (`stat_smooth()`).
## Warning: Removed 59 rows containing missing values or values outside the scale range
## (`geom_point()`).
# plot the statistically preferred model
plot(ggplot(effectSize %>%
filter(Last_Heat_Day == TRUE),
aes(x=Heat,
y=Diversity_mean,
colour=CommRich,
group=CommRich)) +
facet_grid(~protegens) +
geom_hline(yintercept = 0, colour="grey") +
geom_jitter(alpha=0.4, size=0.8, width=0.05) +
geom_line(stat="smooth", method=lm, alpha=0.9) +
scale_colour_viridis_d(option = "viridis", end=0.85) +
labs(title="Resistance (protegens present?)", x="Heat duration (hrs)",
y="Effect size on Shannon Diversity", colour="Inoculated\nRichness"))
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 59 rows containing non-finite outside the scale range
## (`stat_smooth()`).
## Removed 59 rows containing missing values or values outside the scale range
## (`geom_point()`).
print("Which communities have missing values on the last day of heat?")
## [1] "Which communities have missing values on the last day of heat?"
effectSize[which(is.na(effectSize$Diversity_mean) & effectSize$CommRich != "1"),] %>% select(Day, Heat, community) %>% distinct %>% arrange(Heat, Day)
# clean up
rm(resist_divers0, resist_divers1, resist_divers2)
# RECOVERY: let's check for statistically significant effect
recov_divers0 <- with(effectSize %>% filter(Recov_Day == 2),
lm(Diversity_mean ~ CommRich + Heat + CommRich:Heat))
# summarize Maddy's preferred model
print("RECOVERY DIVERSITY DATA. SUMMARY OF THE SIMPLEST MODEL:")
## [1] "RECOVERY DIVERSITY DATA. SUMMARY OF THE SIMPLEST MODEL:"
summary(recov_divers0)
##
## Call:
## lm(formula = Diversity_mean ~ CommRich + Heat + CommRich:Heat)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.35900 -0.00272 0.00000 0.07038 0.49123
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.281e-16 3.087e-02 0.000 1.0000
## CommRich2 -4.802e-02 4.013e-02 -1.197 0.2326
## CommRich3 -1.004e-01 4.366e-02 -2.300 0.0223 *
## CommRich4 6.332e-03 6.904e-02 0.092 0.9270
## Heat12 -2.454e-16 5.042e-02 0.000 1.0000
## Heat24 -1.388e-16 4.423e-02 0.000 1.0000
## Heat48 -2.639e-16 4.486e-02 0.000 1.0000
## CommRich2:Heat12 2.720e-03 6.249e-02 0.044 0.9653
## CommRich3:Heat12 3.362e-02 6.707e-02 0.501 0.6166
## CommRich4:Heat12 -6.829e-03 1.008e-01 -0.068 0.9461
## CommRich2:Heat24 8.304e-03 5.740e-02 0.145 0.8851
## CommRich3:Heat24 1.742e-02 6.466e-02 0.269 0.7878
## CommRich4:Heat24 1.039e-01 1.026e-01 1.012 0.3126
## CommRich2:Heat48 -8.343e-02 5.834e-02 -1.430 0.1539
## CommRich3:Heat48 -5.629e-03 6.448e-02 -0.087 0.9305
## CommRich4:Heat48 -5.780e-03 1.029e-01 -0.056 0.9553
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1381 on 251 degrees of freedom
## Multiple R-squared: 0.1133, Adjusted R-squared: 0.06035
## F-statistic: 2.139 on 15 and 251 DF, p-value: 0.008885
recov_divers1 <- with(effectSize %>% filter(Recov_Day == 2),
lm(Diversity_mean ~ CommRich + withstands_heat + Heat + CommRich:withstands_heat + Heat:withstands_heat + CommRich:Heat))
recov_divers2 <- with(effectSize %>% filter(Recov_Day == 2),
lm(Diversity_mean ~ CommRich + protegens + Heat + CommRich:protegens + Heat:protegens + CommRich:Heat))
# let's compare the simpler model with the more complex one
anova(recov_divers0, recov_divers1)
anova(recov_divers0, recov_divers2)
AIC(recov_divers0, recov_divers1, recov_divers2) %>% arrange(AIC)
BIC(recov_divers0, recov_divers1, recov_divers2) %>% arrange(BIC)
# summarize the statistically preferred model
print("")
## [1] ""
print("RECOVERY DIVERSITY DATA. SUMMARY OF THE LOWEST AIC & BIC MODEL:")
## [1] "RECOVERY DIVERSITY DATA. SUMMARY OF THE LOWEST AIC & BIC MODEL:"
print("")
## [1] ""
summary(recov_divers2)
##
## Call:
## lm(formula = Diversity_mean ~ CommRich + protegens + Heat + CommRich:protegens +
## Heat:protegens + CommRich:Heat)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.23369 -0.01736 -0.00593 0.01935 0.40779
##
## Coefficients: (1 not defined because of singularities)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.008324 0.021549 0.386 0.699610
## CommRich2 -0.112689 0.028728 -3.923 0.000114 ***
## CommRich3 -0.400014 0.036030 -11.102 < 2e-16 ***
## CommRich4 0.031305 0.050668 0.618 0.537256
## protegensTRUE -0.033297 0.031338 -1.063 0.289044
## Heat12 -0.002391 0.035111 -0.068 0.945757
## Heat24 -0.001645 0.030070 -0.055 0.956422
## Heat48 -0.028317 0.030539 -0.927 0.354715
## CommRich2:protegensTRUE 0.142226 0.029544 4.814 2.59e-06 ***
## CommRich3:protegensTRUE 0.421654 0.035175 11.987 < 2e-16 ***
## CommRich4:protegensTRUE NA NA NA NA
## protegensTRUE:Heat12 0.019058 0.034351 0.555 0.579534
## protegensTRUE:Heat24 0.007915 0.034161 0.232 0.816962
## protegensTRUE:Heat48 0.105271 0.034116 3.086 0.002264 **
## CommRich2:Heat12 -0.009650 0.041420 -0.233 0.815969
## CommRich3:Heat12 0.005638 0.046257 0.122 0.903089
## CommRich4:Heat12 -0.023496 0.069503 -0.338 0.735607
## CommRich2:Heat24 0.007869 0.038302 0.205 0.837391
## CommRich3:Heat24 -0.006680 0.045684 -0.146 0.883858
## CommRich4:Heat24 0.097582 0.071404 1.367 0.173000
## CommRich2:Heat48 -0.114108 0.038991 -2.926 0.003750 **
## CommRich3:Heat48 -0.056265 0.045082 -1.248 0.213200
## CommRich4:Heat48 -0.082734 0.071469 -1.158 0.248145
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.08977 on 245 degrees of freedom
## Multiple R-squared: 0.6341, Adjusted R-squared: 0.6028
## F-statistic: 20.22 on 21 and 245 DF, p-value: < 2.2e-16
plot(recov_divers2)
# plot Maddy's preferred model
plot(ggplot(effectSize %>%
filter(Recov_Day == 2),
aes(x=Heat,
y=Diversity_mean,
colour=CommRich,
group=CommRich)) +
geom_hline(yintercept = 0, colour="grey") +
geom_jitter(alpha=0.4, size=0.8, width=0.05) +
geom_line(stat="smooth", method=lm, alpha=0.9) +
scale_colour_viridis_d(option = "viridis", end=0.85) +
labs(title="Recovery", colour="Inoculated\nRichness",
y="Effect size on Shannon Diversity", x="Heat Duration (hrs)"))
## `geom_smooth()` using formula = 'y ~ x'
# plot the preferred model
plot(ggplot(effectSize %>%
filter(Recov_Day == 2),
aes(x=Heat,
y=Diversity_mean,
colour=CommRich,
group=CommRich)) +
facet_grid(~putida) +
geom_hline(yintercept = 0, colour="grey") +
geom_jitter(alpha=0.4, size=0.8, width=0.05) +
geom_line(stat="smooth", method=lm) +
scale_colour_viridis_d(option = "viridis", begin=0.1, end=0.85) +
labs(title="Recovery (protegens present?)", colour="Inoculated\nRichness",
y="Effect size on Shannon Diversity", x="Heat Duration (hrs)"))
## `geom_smooth()` using formula = 'y ~ x'
# clean up
rm(recov_divers0, recov_divers1, recov_divers2)
This summarizes diversity at the end of resistance and diversity at the end of recovery.
When protegens is present, heat has no effect on diversity. This is because protegens gets fixed in those communities.
When protegens is absent, heat has a negative effect on diversity, EXCEPT for the community composed of the 2 slower-growing species (grimontii & veronii) under heat events that are not yet long enough to drive them to extinction.
During resistance, inoculated community richness has an overall negative effect on diversity. This is because the presence of protegens – like the presence of putida – has an overall negative effect on diversity. However, when we take into account this protegens effect, inoculated community richness in fact has a positive effect on diversity. I’m surprised that heat duration on its own does not have a significant effect: this is probably because there are a lot of missing values for 48h of heat. (This model is teetering on the edge of reasonable: 22 estimated parameters on 162 observations.)
Recovery is consistent with resistance. The only difference is that we finally see significant effects of heat duration. Heat duration on its own has a negative effect. This becomes slightly positive when the protegens effect is taken into account. (This model is teetering on the edge of reasonable: 22 estimated parameters on 184 observations.)
BUT!!!! It seems that the linear model is not a good fit to this data. Consider using a GLM with a distribution that can have many 0 values.
Maddy and Gerard suggest that I use the full model to estimate effect
sizes and then use emmeans to estimate the effect sizes post hoc. I'm
tailoring this analysis after the example script that Nico sent me
(Script_simplified for Hermina.R).
One unique attribute of my experimental design is that the Day used as control differs with heat duration (e.g., the last day of recovery for the 6h heat duration is Day 3, but the last day of recovery for 48h is Day 5). One possible solution is to run separate models for each heat treatment with its respective controls (i.e., 4 models in total). To make sure that the effect sizes are directly comparable across the models (especially with respect to the standard deviation), Gerard suggested that I scale the whole dataset prior to splitting it up into 4 (but not center it, as centering would produce negative values that the positive-support GLM families below cannot handle). Finally, if/when testing for significance it will then be necessary to control for multiple testing (e.g., using a Bonferroni correction).
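As a quick sanity check on the scaling step described above, here is a minimal sketch with made-up toy values (not the real data) showing that dividing by the pooled SD without centering keeps all values non-negative while standardizing the spread, and the Bonferroni threshold that follows from 4 tests:

```r
# toy values standing in for the diversity data (illustrative only)
x <- c(0, 0.2, 0.9, 1.3, 0.5)

# divide by the pooled SD *before* splitting, without centering,
# so values stay non-negative and effect sizes are comparable across models
x_scaled <- as.numeric(scale(x, center = FALSE, scale = sd(x, na.rm = TRUE)))
all(x_scaled >= 0)   # TRUE: no negative values, unlike with centering
sd(x_scaled)         # 1: the spread is standardized

# Bonferroni-corrected significance threshold for the 4 per-treatment models
alpha_corrected <- 0.05 / 4   # 0.0125
```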
Note that for diversity I am considering CommRich as a numeric
(instead of as a factor). Initially I tried playing around with CommRich
as an ordered & unordered factor. But I found that
glmmTMB was choosing to drop different predictors because
it was upset that my model was overparameterized. This was particularly
annoying as it was dropping the estimates for the control
treatments…
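A toy illustration (not the real data) of why the factor coding blew up the parameter count: a k-level factor contributes k-1 dummy columns to every term it appears in, while a numeric predictor contributes a single slope.

```r
# toy design with the same CommRich and Heat levels as the experiment
toy <- expand.grid(CommRich = c(1, 2, 3, 4),
                   Heat = factor(c("6", "12", "24", "48")))
# CommRich as a 4-level factor: 1 (intercept) + 3 + 3 + 9 = 16 columns
ncol(model.matrix(~ factor(CommRich) * Heat, data = toy))
# CommRich as numeric: 1 (intercept) + 1 + 3 + 3 = 8 columns
ncol(model.matrix(~ CommRich * Heat, data = toy))
```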
# remove the monocultures from the data
diversity_forFit <- absDen_forFit %>% filter(CommRich > 1) %>% # diversity is nonsense for monocultures
select(-Total_density, -TotDen_plusEpsilon,
-Conc_putida, -Conc_protegens, -Conc_grimontii, -Conc_veronii)
# make heat into a factor with 0 as control
diversity_forFit$Heat <- factor(diversity_forFit$Heat,
levels = c("0", "6", "12", "24", "48"))
levels(diversity_forFit$Heat)[1] <- "control"
# add a column indicating whether the replicate survived
diversity_forFit <- inner_join(diversity_forFit,
extinct.df %>% select(uniqID, Heat, survived),
by = c("uniqID", "Heat"))
# keep only the data where wells did NOT go extinct (in this case diversity is 0 but this is trivial)
#diversity_forFit <- diversity_forFit %>% filter(survived == 1)
# scale the data by its standard deviation
diversity_forFit$Diversity_scale <- scale(diversity_forFit$Diversity,
scale = sd(diversity_forFit$Diversity, na.rm = TRUE),
center = FALSE)
# the max re-scaled value is 5.38 and 38% of the data is 0's
# so try gamma and lognormal distributions (maybe also Gaussian just to check that it's a bad fit?)
summary(diversity_forFit$Diversity_scale)
## V1
## Min. :0.00000
## 1st Qu.:0.00000
## Median :0.01955
## Mean :0.45128
## 3rd Qu.:0.19557
## Max. :5.37608
## NA's :12
sum(diversity_forFit$Diversity_scale == 0, na.rm = TRUE) / length(diversity_forFit$Diversity_scale)
## [1] 0.3850868
# set CommRich to unordered factors
#diversity_forFit$CommRich <- factor(diversity_forFit$CommRich, ordered = FALSE)
# setting Day as a factor (either ordered or unordered) leads to overfitting
#diversity_forFit$Day <- factor(diversity_forFit$Day, ordered = FALSE)
# !!! emmeans expects the control to be the very *last* level !!!
diversity_forFit$Heat <- factor(diversity_forFit$Heat,
levels = c("6", "12", "24", "48", "control"))
# let's compare different GLM families
try_gaussian <- glmmTMB(Diversity_scale ~ CommRich:Day + Heat:Day + CommRich:Heat:Day,
data = diversity_forFit,
control = glmmTMBControl(optCtrl = list(iter.max = 10000,eval.max = 10000)))
simulateResiduals(fittedModel = try_gaussian, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 1 0.196 0.26 0.992 1 0.2 0.208 0.944 0.208 0.248 1 0.18 0.26 0.224 1 0.212 0.212 0.244 1 0.256 ...
try_gamma <- glmmTMB(Diversity_scale ~ CommRich:Day + Heat:Day + CommRich:Heat:Day,
data = diversity_forFit,
family = ziGamma,
ziformula = ~1, # this needs to be added because there are 0 values in the data
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
## Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
## NA/NaN function evaluation
## (the same warning was emitted 13 times during fitting; repeats omitted)
simulateResiduals(fittedModel = try_gamma, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.952 0.42 0.3568358 0.944 0.94 0.3401457 0.3068186 0.88 0.58 0.1703229 0.988 0.456 0.612 0.512 0.964 0.1434111 0.532 0.428 0.984 0.424 ...
try_lognorm <- glmmTMB(Diversity_scale ~ CommRich:Day + Heat:Day + CommRich:Heat:Day,
data = diversity_forFit,
family = lognormal,
ziformula = ~1, # this needs to be added because there are 0 values in the data
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = try_lognorm, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.944 0.392 0.160635 0.952 0.948 0.1410465 0.2099037 0.948 0.488 0.3268879 0.968 0.476 0.628 0.5 0.972 0.09914734 0.556 0.4 0.988 0.452 ...
try_LOGlognorm <- glmmTMB(log(Diversity_scale+1) ~ CommRich:Day + Heat:Day + CommRich:Heat:Day,
data = diversity_forFit,
family = lognormal,
ziformula = ~1, # I'm keeping this as 0-inflated lognormal alone was already over-dispersed. So I want to see if the log(x+1) transformation sufficiently brings in the long tail.
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = try_LOGlognorm, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.932 0.38 0.160635 0.94 0.92 0.1410465 0.2099037 0.94 0.488 0.3268879 0.944 0.476 0.644 0.508 0.956 0.09914734 0.568 0.4 0.976 0.448 ...
try_negbinom <- glmmTMB(as.integer(Diversity_scale*1000) ~ CommRich:Day + Heat:Day + CommRich:Heat:Day,
data = diversity_forFit,
family = nbinom2,
control = glmmTMBControl(optCtrl = list(iter.max = 10000,eval.max = 10000)))
simulateResiduals(fittedModel = try_negbinom, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.9 0.484 0.02934308 0.888 0.932 0.001846165 0.1464347 0.844 0.524 0.04893932 0.928 0.545532 0.592 0.498831 0.964 0.08390474 0.62 0.436 0.972 0.496 ...
try_negbinom0 <- glmmTMB(as.integer(Diversity_scale*1000) ~ CommRich:Day + Heat:Day + CommRich:Heat:Day,
data = diversity_forFit,
family = nbinom2,
ziformula = ~1, # try zero inflated distribution
control = glmmTMBControl(optCtrl = list(iter.max = 10000,eval.max = 10000)))
simulateResiduals(fittedModel = try_negbinom0, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.968 0.4760399 0.09975181 0.944 0.96 0.1870824 0.1012531 0.856 0.524 0.2883702 0.972 0.488 0.5489878 0.472 0.972 0.2933996 0.576 0.4214444 0.968 0.48 ...
try_poisson <- glmmTMB(as.integer(Diversity_scale*1000) ~ CommRich:Day + Heat:Day + CommRich:Heat:Day,
data = diversity_forFit,
family = genpois,
#ziformula = ~1, # try zero inflated distribution
control = glmmTMBControl(optCtrl = list(iter.max = 10000,eval.max = 10000)))
simulateResiduals(fittedModel = try_poisson, plot = TRUE)
## DHARMa:testOutliers with type = binomial may have inflated Type I error rates for integer-valued distributions. To get a more exact result, it is recommended to re-run testOutliers with type = 'bootstrap'. See ?testOutliers for details
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.992 0.4616676 0.04975974 0.968 0.98 0.001601337 0.06714782 0.96 0.5715112 0.02849031 0.98 0.5892411 0.78 0.728 0.984 0.0152989 0.78 0.332584 0.984 0.4979502 ...
try_poisson0 <- glmmTMB(as.integer(Diversity_scale*1000) ~ CommRich:Day + Heat:Day + CommRich:Heat:Day,
data = diversity_forFit,
family = genpois,
ziformula = ~1, # try zero inflated distribution
control = glmmTMBControl(optCtrl = list(iter.max = 10000,eval.max = 10000)))
simulateResiduals(fittedModel = try_poisson0, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.956 0.3819737 0.211971 0.968 0.964 0.01318631 0.2743196 0.924 0.3770803 0.244544 0.972 0.388 0.584 0.4491087 0.96 0.05561836 0.5923928 0.3206311 0.972 0.356 ...
# let's check this with AIC and BIC
AIC(try_gaussian, try_gamma, try_lognorm, try_LOGlognorm,
try_negbinom, try_negbinom0, try_poisson, try_poisson0) %>% arrange(AIC)
BIC(try_gaussian, try_gamma, try_lognorm, try_LOGlognorm,
try_negbinom, try_negbinom0, try_poisson, try_poisson0) %>% arrange(BIC)
# clean up
rm(try_gamma, try_lognorm, try_LOGlognorm, try_negbinom, try_negbinom0, try_poisson, try_poisson0)
According to the residuals, the zero-inflated negative binomial and the zero-inflated lognormal are about equally okay-ish. We could also take the AIC & BIC values into consideration, but that is far less important. At the end of the day my decision is to go with the zero-inflated lognormal. The reason is that the most important consideration when selecting a GLM family is which family would a priori be the most natural choice. For diversity data, the Gamma or lognormal distributions are the most natural choices a priori because, for 4 species, Shannon diversity is a continuous variable between 0 and 1.386294. Therefore I think it makes sense to choose the lognormal (even if its residuals are not perfect).
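The 1.386294 bound quoted above is just log(4): Shannon diversity is maximized when all S species are equally abundant, giving H = log(S). A quick check:

```r
# Shannon diversity H = -sum(p_i * log(p_i)), skipping zero abundances
shannon <- function(p) { p <- p[p > 0]; -sum(p * log(p)) }
shannon(rep(1/4, 4))   # 1.386294..., i.e. log(4), the maximum for 4 species
shannon(c(1, 0, 0, 0)) # 0, a monoculture (or a well down to one species)
```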
Note that in the model fitting above I consider Day as a numeric
predictor. This is because I want to decide on the GLM family by
considering the complete data. (I was having problems with Day as an
un/ordered factor, for reasons that are troubleshot in the
unevaluated code below.) For the rest of the analysis below, I consider
the effect of day (called Trtmt_Day) as a factor
representing either resistance (i.e., the last day of heat) or
recovery.
##
## below is some old trouble shooting I had to do while glmmTMB was complaining about "dropping columns from rank-deficient conditional model"
## I found the following stack overflow string that was very helpful in diagnosing this problem
## https://stackoverflow.com/questions/78183492/using-glmmtmb-getting-warning-message-dropping-columns-from-rank-deficient-cond
## but in the end it is no longer relevant for later versions of the code because I am using both Day and CommRich as numeric instead of un/ordered factors.
##
# create all possible combinations for the 3 different predictors and display how many replicates there are for each combo
data_combinations <- with(diversity_forFit,
as.data.frame(table(Day, CommRich, Heat)))
print(data_combinations)
# from here we see that there's no data for Day 5 at Heat durations of 24 and 12 hours
# check to see which cases get dropped (even if data does exist for these combinations)
model_combinations <- with(model.frame(try_gaussian),
as.data.frame(table(Day, CommRich, Heat)))
print(model_combinations)
# these data.frames seem very similar but let's take advantage of same row order to check identity of Freq values
all(data_combinations$Freq == model_combinations$Freq)
# indeed, it seems that cases are NOT getting dropped for having NA values in the predictors
# (or, well, at least these cases were covered simply when we looked at the data_combinations above)
# see which combinations of variables were aliased,
X <- model.matrix(~ CommRich:Day + Heat:Day + CommRich:Heat:Day,
data = diversity_forFit)
caret::findLinearCombos(X)
rm(data_combinations, model_combinations, try_gaussian)
In order to actually analyze the data, I need to split it up into 4 separate datasets, one per heat treatment, each with its associated control. Remember that we need to correct for the fact that we are doing 4 independent tests. So instead of the typical \(\alpha=0.05\) we use the Bonferroni-corrected \(\alpha / n = 0.05/4 = 0.0125\) (or simply 0.01).
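Equivalently, the correction can be applied to the p-values themselves with p.adjust, which multiplies each p by the number of tests (capped at 1); the p-values below are made up purely for illustration:

```r
# Bonferroni: compare each raw p to alpha/n, or each adjusted p to alpha
p_raw <- c(0.004, 0.030, 0.200, 0.011)          # illustrative values only
p_adj <- p.adjust(p_raw, method = "bonferroni") # 0.016 0.120 0.800 0.044
(p_raw < 0.05 / 4) == (p_adj < 0.05)            # the two criteria agree
```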
# clean up from above
rm(try_gaussian)
####################
# 6h heat duration
####################
# grab just the treatment with its associated control data
diversity_6h <- rbind(diversity_forFit %>% filter(Heat == "6"),
diversity_forFit %>% filter(Heat == "control", Day < 4))
# create a column for last day of heat, first day of recovery, and last day of recovery
diversity_6h$Trtmt_Day <- "resist"
diversity_6h$Trtmt_Day[diversity_6h$Day == 2] <- "recov_1"
diversity_6h$Trtmt_Day[diversity_6h$Day == 3] <- "recov_2"
divers6h_H0 <- glmmTMB(Diversity_scale ~ CommRich*Heat*Trtmt_Day,
data = diversity_6h,
family = lognormal,
ziformula = ~Heat, # this needs to be added because there are 0 values in the data
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = divers6h_H0, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.752 0.772 0.6 0.684 0.02032453 0.504 0.82 0.392 0.2717882 0.892 0.568 0.468 0.832 0.488 0.596 0.356 0.52 0.88 0.2737051 0.704 ...
# note that putida is both in the zero inflation & fixed effect
divers6h_H1 <- glmmTMB(Diversity_scale ~ CommRich*Heat*Trtmt_Day + putida,
data = diversity_6h,
family = lognormal,
ziformula = ~putida + Heat,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = divers6h_H1, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.7 0.72 0.516 0.596 0.01123198 0.352 0.776 0.624 0.5369474 0.94 0.464 0.324 0.776 0.692 0.516 0.248 0.392 0.828 0.521029 0.628 ...
divers6h_H2 <- glmmTMB(Diversity_scale ~ CommRich*Heat*Trtmt_Day + protegens,
data = diversity_6h,
family = lognormal,
ziformula = ~protegens + Heat,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = divers6h_H2, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.06 0.628 0.288 0.972 0.0353005 0.068 0.676 0.788 0.4408272 0.86 0.86 0.696 0.192 0.768 0.744 0.836 0.112 0.792 0.4517783 0.928 ...
# unfortunately this also opens up the possibility of interactions in the fixed effects
divers6h_H2_1 <- glmmTMB(Diversity_scale ~ CommRich*Heat*Trtmt_Day + protegens*CommRich,
data = diversity_6h,
family = lognormal,
ziformula = ~protegens + Heat,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
divers6h_H3 <- glmmTMB(Diversity_scale ~ CommRich + Heat*Trtmt_Day*protegens,
data = diversity_6h,
family = lognormal,
ziformula = ~protegens*Heat,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = divers6h_H3, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.164 0.704 0.344 0.956 0.0353005 0.084 0.756 0.68 0.4408272 0.904 0.764 0.616 0.352 0.684 0.64 0.708 0.132 0.832 0.4517783 0.88 ...
# check preferred models
anova(divers6h_H0, divers6h_H1)
anova(divers6h_H0, divers6h_H2)
anova(divers6h_H2, divers6h_H2_1)
AIC(divers6h_H0, divers6h_H1, divers6h_H2, divers6h_H2_1, divers6h_H3) %>% arrange(AIC)
BIC(divers6h_H0, divers6h_H1, divers6h_H2, divers6h_H2_1, divers6h_H3) %>% arrange(BIC)
# H3 is the preferred model
summary(divers6h_H3)
## Family: lognormal ( log )
## Formula: Diversity_scale ~ CommRich + Heat * Trtmt_Day * protegens
## Zero inflation: ~protegens * Heat
## Data: diversity_6h
##
## AIC BIC logLik deviance df.resid
## 148.4 214.8 -56.2 112.4 278
##
##
## Dispersion parameter for lognormal family (): 0.956
##
## Conditional model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.59704 0.24731 -6.458 1.06e-10 ***
## CommRich 0.42194 0.07079 5.961 2.51e-09 ***
## Heatcontrol 1.64725 0.16480 9.995 < 2e-16 ***
## Trtmt_Dayrecov_2 -0.23143 0.20554 -1.126 0.2602
## Trtmt_Dayresist 0.04455 0.19445 0.229 0.8188
## protegens -1.25094 0.24693 -5.066 4.06e-07 ***
## Heatcontrol:Trtmt_Dayrecov_2 -0.41996 0.24682 -1.701 0.0889 .
## Heatcontrol:Trtmt_Dayresist 0.27526 0.21478 1.282 0.2000
## Heatcontrol:protegens -1.36437 0.32097 -4.251 2.13e-05 ***
## Trtmt_Dayrecov_2:protegens 0.22891 0.36069 0.635 0.5257
## Trtmt_Dayresist:protegens -0.10227 0.34347 -0.298 0.7659
## Heatcontrol:Trtmt_Dayrecov_2:protegens 0.28165 0.46295 0.608 0.5429
## Heatcontrol:Trtmt_Dayresist:protegens -0.35707 0.43611 -0.819 0.4129
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Zero-inflation model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -24.203 23862.841 -0.001 0.999
## protegens 24.280 23862.841 0.001 0.999
## Heatcontrol 1.242 27889.280 0.000 1.000
## protegens:Heatcontrol -1.818 27889.280 0.000 1.000
# create data.frame for plotting
divers_predict <- cbind(divers6h_H3$frame,
predict(divers6h_H3, type="response"))
colnames(divers_predict)[c(1,6)] <- c("observed", "predicted")
# plot the model predictions against the data
ggplot(divers_predict,
aes(x=Trtmt_Day, y=observed, colour=as.factor(CommRich))) +
facet_grid(protegens~Heat) +
geom_jitter(alpha=0.4) +
geom_line(aes(y=predicted, group=as.factor(CommRich))) +
scale_colour_viridis_d(option = "viridis", begin=0.1, end=0.85) +
labs(y="Shannon diversity (rescaled)",
colour="CommRich")
# cleanup
rm(divers_predict)
# here's another way to plot this in case I'm interested later
#ggplot(diversity_6h %>%
# mutate(predictions = predict(divers6h_H2, type="response")),
# aes(x=Day, y=Diversity_scale, colour=as.factor(CommRich))) +
# facet_grid(~Heat) +
# geom_jitter(alpha=0.4) +
# geom_line(aes(y=predictions,
# group=paste(CommRich, protegens),
# linetype=as.factor(protegens))) +
# labs(y="Shannon diversity (rescaled)",
# colour="CommRich",
# linetype="protegens?")
####################
# 12h heat duration
####################
# grab just the treatment with its associated control data
diversity_12h <- rbind(diversity_forFit %>% filter(Heat == "12", Day > 1),
diversity_forFit %>% filter(Heat == "control", Day > 1, Day != 5))
# create a column for last day of heat, first day of recovery, and last day of recovery
diversity_12h$Trtmt_Day <- "resist"
diversity_12h$Trtmt_Day[diversity_12h$Day == 3] <- "recov_1"
diversity_12h$Trtmt_Day[diversity_12h$Day == 4] <- "recov_2"
divers12h_H0 <- glmmTMB(Diversity_scale ~ CommRich*Heat*Trtmt_Day,
data = diversity_12h,
family = lognormal,
ziformula = ~ Heat, # this needs to be added because there are 0 values in the data
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = divers12h_H0, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.668 0.2899017 0.716 0.4349943 0.564 0.4299888 0.1381018 0.988 0.1765794 0.4057324 0.3769693 0.596 0.4102262 0.4848194 0.684 0.1440025 0.712 0.62 0.796 0.3096465 ...
divers12h_H1 <- glmmTMB(Diversity_scale ~ CommRich*Heat*Trtmt_Day + putida,
data = diversity_12h,
family = lognormal,
ziformula = ~putida + Heat,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = divers12h_H1, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.616 0.1968735 0.652 0.3332618 0.476 0.6214964 0.2513453 0.992 0.1289798 0.2940629 0.5016363 0.492 0.3240442 0.3374343 0.584 0.2880049 0.604 0.56 0.744 0.5283877 ...
divers12h_H2 <- glmmTMB(Diversity_scale ~ CommRich*Heat*Trtmt_Day + protegens,
data = diversity_12h,
family = lognormal,
ziformula = ~protegens + Heat,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = divers12h_H2, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.312 0.08221093 0.9 0.4595505 0.248 0.5167092 0.2168199 0.98 0.2241791 0.554625 0.4125884 0.724 0.4653826 0.1784135 0.348 0.2505043 0.796 0.816 0.396 0.4460046 ...
divers12h_H2_1 <- glmmTMB(Diversity_scale ~ CommRich*Heat*Trtmt_Day + protegens*CommRich,
data = diversity_12h,
family = lognormal,
ziformula = ~protegens + Heat,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
divers12h_H3 <- glmmTMB(Diversity_scale ~ CommRich + Heat*Trtmt_Day*protegens,
data = diversity_12h,
family = lognormal,
ziformula = ~protegens*Heat,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = divers12h_H3, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.536 0.1644219 0.86 0.4490264 0.4 0.4769623 0.2071527 0.992 0.2134308 0.5322911 0.3680645 0.608 0.4515935 0.2676203 0.516 0.234004 0.744 0.744 0.496 0.4033927 ...
# check preferred models
anova(divers12h_H0, divers12h_H1)
anova(divers12h_H0, divers12h_H2)
anova(divers12h_H2, divers12h_H2_1)
AIC(divers12h_H0, divers12h_H1, divers12h_H2, divers12h_H2_1, divers12h_H3) %>% arrange(AIC)
BIC(divers12h_H0, divers12h_H1, divers12h_H2, divers12h_H2_1, divers12h_H3) %>% arrange(BIC)
# H2 is the preferred model (even if its residuals are the worst)
summary(divers12h_H2)
## Family: lognormal ( log )
## Formula: Diversity_scale ~ CommRich * Heat * Trtmt_Day + protegens
## Zero inflation: ~protegens + Heat
## Data: diversity_12h
##
## AIC BIC logLik deviance df.resid
## 298.0 359.9 -132.0 264.0 265
##
##
## Dispersion parameter for lognormal family (): 1.78
##
## Conditional model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.7206 0.7647 -2.250 0.0244 *
## CommRich 0.4704 0.2778 1.693 0.0904 .
## Heatcontrol 2.0434 0.8447 2.419 0.0156 *
## Trtmt_Dayrecov_2 0.5381 1.0040 0.536 0.5920
## Trtmt_Dayresist 0.8408 0.9606 0.875 0.3814
## protegens -1.4056 0.1522 -9.236 <2e-16 ***
## CommRich:Heatcontrol -0.3929 0.3173 -1.238 0.2156
## CommRich:Trtmt_Dayrecov_2 -0.1754 0.3647 -0.481 0.6305
## CommRich:Trtmt_Dayresist -0.2772 0.3565 -0.778 0.4368
## Heatcontrol:Trtmt_Dayrecov_2 -0.9662 1.1819 -0.818 0.4136
## Heatcontrol:Trtmt_Dayresist 0.0201 1.1084 0.018 0.9855
## CommRich:Heatcontrol:Trtmt_Dayrecov_2 0.3709 0.4408 0.841 0.4001
## CommRich:Heatcontrol:Trtmt_Dayresist 0.0818 0.4194 0.195 0.8454
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Zero-inflation model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.6438 0.3265 -5.035 4.79e-07 ***
## protegens 2.0186 0.3447 5.856 4.75e-09 ***
## Heatcontrol -0.5982 0.2683 -2.229 0.0258 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# plot the model predictions against the data
ggplot(diversity_12h %>%
mutate(predictions = predict(divers12h_H2, type="response")),
aes(x=Day, y=Diversity_scale, colour=as.factor(CommRich))) +
facet_grid(protegens~Heat) +
geom_jitter(alpha=0.4) +
geom_line(aes(y=predictions, group=as.factor(CommRich))) +
scale_colour_viridis_d(option = "viridis", begin=0.1, end=0.85) +
labs(y="Shannon diversity (rescaled)",
colour="CommRich")
####################
# 24h heat duration
####################
# grab just the treatment with its associated control data
diversity_24h <- rbind(diversity_forFit %>% filter(Heat == "24", Day > 1),
diversity_forFit %>% filter(Heat == "control", Day > 1, Day != 5))
# create a column for last day of heat, first day of recovery, and last day of recovery
diversity_24h$Trtmt_Day <- "resist"
diversity_24h$Trtmt_Day[diversity_24h$Day == 3] <- "recov_1"
diversity_24h$Trtmt_Day[diversity_24h$Day == 4] <- "recov_2"
divers24h_H0 <- glmmTMB(Diversity_scale ~ CommRich*Heat*Trtmt_Day,
data = diversity_24h,
family = lognormal,
ziformula = ~Heat, # this needs to be added because there are 0 values in the data
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = divers24h_H0, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.812 0.62 0.2305388 0.304 0.624 0.88 0.1507346 0.0060084 0.732 0.352 0.06998273 0.856 0.125618 0.0594783 0.34 0.556 0.824 0.09148179 0.808 0.548 ...
divers24h_H1 <- glmmTMB(Diversity_scale ~ CommRich*Heat*Trtmt_Day + putida,
data = diversity_24h,
family = lognormal,
ziformula = ~putida + Heat,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = divers24h_H1, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.808 0.556 0.1075848 0.224 0.524 0.86 0.2342496 0.00916281 0.804 0.264 0.0436256 0.844 0.193855 0.03568698 0.252 0.504 0.8 0.142862 0.748 0.48 ...
divers24h_H2 <- glmmTMB(Diversity_scale ~ CommRich*Heat*Trtmt_Day + protegens,
data = diversity_24h,
family = lognormal,
ziformula = ~protegens + Heat,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = divers24h_H2, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.696 0.272 0.3304389 0.54 0.236 0.78 0.1975845 0.007134975 0.46 0.48 0.09633986 0.528 0.1597365 0.08601477 0.692 0.232 0.688 0.1228112 0.9 0.692 ...
divers24h_H2_1 <- glmmTMB(Diversity_scale ~ CommRich*Heat*Trtmt_Day + protegens*CommRich,
data = diversity_24h,
family = lognormal,
ziformula = ~protegens + Heat,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
divers24h_H3 <- glmmTMB(Diversity_scale ~ CommRich + Heat*Trtmt_Day*protegens,
data = diversity_24h,
family = lognormal,
ziformula = ~protegens*Heat,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = divers24h_H3, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.772 0.308 0.2728042 0.552 0.316 0.836 0.175178 0.00645903 0.5 0.508 0.08179799 0.512 0.1426773 0.07045891 0.74 0.284 0.756 0.1015072 0.904 0.688 ...
# check preferred models
anova(divers24h_H0, divers24h_H1)
anova(divers24h_H0, divers24h_H2)
anova(divers24h_H2, divers24h_H2_1)
AIC(divers24h_H0, divers24h_H1, divers24h_H2, divers24h_H2_1, divers24h_H3) %>% arrange(AIC)
BIC(divers24h_H0, divers24h_H1, divers24h_H2, divers24h_H2_1, divers24h_H3) %>% arrange(BIC)
# H2 is the preferred model (even if its residuals are the worst)
summary(divers24h_H2)
## Family: lognormal ( log )
## Formula: Diversity_scale ~ CommRich * Heat * Trtmt_Day + protegens
## Zero inflation: ~protegens + Heat
## Data: diversity_24h
##
## AIC BIC logLik deviance df.resid
## 359.8 420.9 -162.9 325.8 253
##
##
## Dispersion parameter for lognormal family (): 1.86
##
## Conditional model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.82597 0.56916 -3.208 0.001336 **
## CommRich 0.59430 0.20778 2.860 0.004233 **
## Heatcontrol 2.25305 0.68381 3.295 0.000985 ***
## Trtmt_Dayrecov_2 0.02916 0.98732 0.030 0.976442
## Trtmt_Dayresist 1.39241 0.69548 2.002 0.045276 *
## protegens -1.31145 0.13846 -9.472 < 2e-16 ***
## CommRich:Heatcontrol -0.56620 0.26170 -2.164 0.030495 *
## CommRich:Trtmt_Dayrecov_2 -0.05146 0.36028 -0.143 0.886426
## CommRich:Trtmt_Dayresist -0.23534 0.26667 -0.883 0.377498
## Heatcontrol:Trtmt_Dayrecov_2 -0.43534 1.17505 -0.370 0.711022
## Heatcontrol:Trtmt_Dayresist -0.54918 0.89533 -0.613 0.539623
## CommRich:Heatcontrol:Trtmt_Dayrecov_2 0.24149 0.44050 0.548 0.583542
## CommRich:Heatcontrol:Trtmt_Dayresist 0.04106 0.34901 0.118 0.906350
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Zero-inflation model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -2.1039 0.3485 -6.037 1.57e-09 ***
## protegens 1.6731 0.3557 4.704 2.55e-06 ***
## Heatcontrol 0.1486 0.2773 0.536 0.592
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# plot the model predictions against the data
ggplot(diversity_24h %>%
mutate(predictions = predict(divers24h_H2, type="response")),
aes(x=Day, y=Diversity_scale, colour=as.factor(CommRich))) +
facet_grid(protegens~Heat) +
geom_jitter(alpha=0.4) +
geom_line(aes(y=predictions, group=as.factor(CommRich))) +
scale_colour_viridis_d(option = "viridis", begin=0.1, end=0.85) +
labs(y="Shannon diversity (rescaled)",
colour="CommRich")
####################
# 48h heat duration
####################
# grab just the treatment with its associated control data
diversity_48h <- rbind(diversity_forFit %>% filter(Heat == "48", Day > 2),
diversity_forFit %>% filter(Heat == "control", Day > 2))
# create a column for last day of heat, first day of recovery, and last day of recovery
diversity_48h$Trtmt_Day <- "resist"
diversity_48h$Trtmt_Day[diversity_48h$Day == 4] <- "recov_1"
diversity_48h$Trtmt_Day[diversity_48h$Day == 5] <- "recov_2"
# drop the resistance data altogether from the 48h treatment because glmmTMB fails to converge with a non-positive-definite (NPD) Hessian
# Running diagnose(<model>) shows that the likelihood surface is flat near the MLE for the parameters Trtmt_Dayresist, CommRich:Trtmt_Dayresist, Heatcontrol:Trtmt_Dayresist, and CommRich:Heatcontrol:Trtmt_Dayresist
# This is likely because of all the NA values during the last day of heat for this longest duration
diversity_48h <- diversity_48h %>% filter(Trtmt_Day != "resist")
divers48h_H0 <- glmmTMB(Diversity_scale ~ CommRich*Heat*Trtmt_Day,
data = diversity_48h,
family = lognormal,
ziformula = ~Heat, # this needs to be added because there are 0 values in the data
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = divers48h_H0, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.5691481 0.6128799 0.4948025 0.4771902 0.2877201 0.5332018 0.4559984 0.8213178 0.7989364 0.546471 0.3050185 0.92 0.5761426 0.6870577 0.79712 0.7339274 0.952 0.7713301 0.8462497 0.03518785 ...
divers48h_H1 <- glmmTMB(Diversity_scale ~ CommRich*Heat + putida,
data = diversity_48h,
family = lognormal,
ziformula = ~putida + Heat,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = divers48h_H1, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.5290673 0.5823746 0.4690071 0.4449476 0.2663084 0.5774305 0.5106341 0.7791005 0.7621191 0.5134286 0.3227205 0.892 0.5506496 0.6356889 0.7568983 0.7967411 0.912 0.7336206 0.791145 0.0390796 ...
divers48h_H2 <- glmmTMB(Diversity_scale ~ CommRich*Heat*Trtmt_Day + protegens,
data = diversity_48h,
family = lognormal,
ziformula = ~protegens + Heat, # this is the difference!!!
control = glmmTMBControl(optCtrl = list(iter.max = 1e6, eval.max = 1e6)))
simulateResiduals(fittedModel = divers48h_H2, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.5130349 0.5629621 0.5229429 0.4922367 0.2609555 0.5577733 0.4854176 0.8596971 0.832072 0.5083451 0.315912 0.936 0.5914385 0.6292678 0.7349592 0.7405394 0.98 0.8021833 0.7675288 0.03664726 ...
divers48h_H2_1 <- glmmTMB(Diversity_scale ~ protegens + Heat + CommRich,
data = diversity_48h,
family = lognormal,
ziformula = ~protegens + Heat,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
divers48h_H3 <- glmmTMB(Diversity_scale ~ CommRich + Heat*Trtmt_Day*protegens,
data = diversity_48h,
family = lognormal,
ziformula = ~protegens*Heat,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
## Warning in finalizeTMB(TMBStruc, obj, fit, h, data.tmb.old): Model convergence
## problem; non-positive-definite Hessian matrix. See vignette('troubleshooting')
## Warning in finalizeTMB(TMBStruc, obj, fit, h, data.tmb.old): Model convergence
## problem; singular convergence (7). See vignette('troubleshooting'),
## help('diagnose')
simulateResiduals(fittedModel = divers48h_H3, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.6680142 0.6933031 0.4596269 0.4234526 0.3345583 0.4865159 0.4244777 0.7560729 0.7363469 0.6354314 0.2777847 0.892 0.5379031 0.8026375 0.9141284 0.6810318 0.908 0.7233362 0.9840112 0.03275551 ...
# check preferred models
anova(divers48h_H0, divers48h_H1)
anova(divers48h_H0, divers48h_H2)
anova(divers48h_H2, divers48h_H2_1)
AIC(divers48h_H0, divers48h_H1, divers48h_H2, divers48h_H2_1, divers48h_H3) %>% arrange(AIC)
BIC(divers48h_H0, divers48h_H1, divers48h_H2, divers48h_H2_1, divers48h_H3) %>% arrange(BIC)
# H2_1 is the preferred model because it has just as much explanatory power as H2
summary(divers48h_H2_1)
## Family: lognormal ( log )
## Formula: Diversity_scale ~ protegens + Heat + CommRich
## Zero inflation: ~protegens + Heat
## Data: diversity_48h
##
## AIC BIC logLik deviance df.resid
## 183.2 208.6 -83.6 167.2 168
##
##
## Dispersion parameter for lognormal family (): 1.38
##
## Conditional model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.4541 0.4912 -0.925 0.3552
## protegens -1.9838 0.2237 -8.867 <2e-16 ***
## Heatcontrol 0.2092 0.2860 0.731 0.4645
## CommRich 0.3762 0.1405 2.678 0.0074 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Zero-inflation model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 1.4043 0.3732 3.763 0.000168 ***
## protegens 0.8513 0.4035 2.110 0.034879 *
## Heatcontrol -2.6845 0.4016 -6.685 2.32e-11 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# plot the model predictions against the data
ggplot(diversity_48h %>%
mutate(predictions = predict(divers48h_H2_1, type="response")),
aes(x=Day, y=Diversity_scale, colour=as.factor(CommRich))) +
facet_grid(protegens~Heat) +
geom_jitter(alpha=0.4) +
geom_line(aes(y=predictions, group=as.factor(CommRich))) +
scale_colour_viridis_d(option = "viridis", begin=0.1, end=0.85) +
labs(y="Shannon diversity (rescaled)",
colour="CommRich",
title="H2_1: diversity ~ protegens + Heat + CommRich")
# but I will use H2 below because it is consistent with the models fitted to the other heat durations
summary(divers48h_H2)
## Family: lognormal ( log )
## Formula: Diversity_scale ~ CommRich * Heat * Trtmt_Day + protegens
## Zero inflation: ~protegens + Heat
## Data: diversity_48h
##
## AIC BIC logLik deviance df.resid
## 186.8 228.0 -80.4 160.8 163
##
##
## Dispersion parameter for lognormal family (): 1.21
##
## Conditional model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.1423 1.6926 -0.084 0.933
## CommRich 0.2516 0.4672 0.538 0.590
## Heatcontrol -0.8708 1.7798 -0.489 0.625
## Trtmt_Dayrecov_2 0.8891 2.0089 0.443 0.658
## protegens -2.1234 0.2185 -9.718 <2e-16 ***
## CommRich:Heatcontrol 0.3805 0.5053 0.753 0.451
## CommRich:Trtmt_Dayrecov_2 -0.1598 0.5742 -0.278 0.781
## Heatcontrol:Trtmt_Dayrecov_2 0.1314 2.0871 0.063 0.950
## CommRich:Heatcontrol:Trtmt_Dayrecov_2 -0.1552 0.6154 -0.252 0.801
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Zero-inflation model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 1.4043 0.3732 3.763 0.000168 ***
## protegens 0.8513 0.4035 2.110 0.034880 *
## Heatcontrol -2.6845 0.4016 -6.685 2.32e-11 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
ggplot(diversity_48h %>%
mutate(predictions = predict(divers48h_H2, type="response")),
aes(x=Day, y=Diversity_scale, colour=as.factor(CommRich))) +
facet_grid(protegens~Heat) +
geom_jitter(alpha=0.4) +
geom_line(aes(y=predictions, group=as.factor(CommRich))) +
scale_colour_viridis_d(option = "viridis", begin=0.1, end=0.85) +
labs(y="Shannon diversity (rescaled)",
colour="CommRich",
title="H2: diversity ~ CommRich*Heat*Trtmt_Day + protegens")
############################
# effect sizes
############################
# get the effect size & post-hoc p-values
## remember that we need to correct for multiple comparisons that were generated by subsetting the data into 4 parts
## THEREFORE CONSIDER ALPHA/N = 0.05/4 = 0.0125 as the threshold for significance
emm_6h <- emmeans(divers6h_H2, ~ Heat | CommRich*Trtmt_Day + protegens, data = diversity_6h, type = "response")
effect_6h <- eff_size(emm_6h, sigma(divers6h_H2), edf = df.residual(divers6h_H2))
emm_12h <- emmeans(divers12h_H2, ~ Heat | CommRich*Trtmt_Day + protegens, data = diversity_12h, type = "response")
effect_12h <- eff_size(emm_12h, sigma(divers12h_H2), edf = df.residual(divers12h_H2))
emm_24h <- emmeans(divers24h_H2, ~ Heat | CommRich*Trtmt_Day + protegens, data = diversity_24h, type = "response")
effect_24h <- eff_size(emm_24h, sigma(divers24h_H2), edf = df.residual(divers24h_H2))
emm_48h <- emmeans(divers48h_H2, ~ Heat | CommRich*Trtmt_Day + protegens, data = diversity_48h, type = "response")
effect_48h <- eff_size(emm_48h, sigma(divers48h_H2), edf = df.residual(divers48h_H2))
# a function that extracts the confidence intervals from eff_size *** contingent on protegens ***
get_effsize_CIs <- function(eff_size_object, heat_trtmt) {
data.frame(Heat = heat_trtmt,
CommRich = confint(eff_size_object)[[2]],
Trtmt_Day = confint(eff_size_object)[[3]],
protegens = confint(eff_size_object)[[4]],
est = confint(eff_size_object)[[5]],
loCI = confint(eff_size_object)[[8]],
hiCI = confint(eff_size_object)[[9]])
}
# create a data.frame for plotting marginal effect sizes using a forest plot with the group labels
div_effects_protegens <- data.frame()
div_effects_protegens <- rbind(div_effects_protegens,
get_effsize_CIs(effect_6h, heat_trtmt = 6),
get_effsize_CIs(effect_12h, heat_trtmt = 12),
get_effsize_CIs(effect_24h, heat_trtmt = 24),
get_effsize_CIs(effect_48h, heat_trtmt = 48),
c(48, NA, "resist", 0, rep(NA, 3)),
c(48, NA, "resist", 1, rep(NA, 3)))
# several columns were coerced to character. They need to be put back as numeric values
div_effects_protegens[,-3] <- sapply(div_effects_protegens[,-3], as.numeric) # change all but 3rd column into numeric
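The coercion happens because `rbind()`-ing a bare character vector onto a data.frame forces every affected column to character. A small self-contained illustration (toy column names, not the effect-size table):

```r
# rbind() of a character vector coerces data.frame columns to character:
df <- data.frame(num = 1:2, lab = c("x", "y"))
df <- rbind(df, c(3, "z"))    # c(3, "z") is really c("3", "z"), so 'num' is coerced
class(df$num)                 # "character"

# restore the numeric column, as done above with sapply() + as.numeric()
df$num <- as.numeric(df$num)
class(df$num)                 # "numeric"
```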
# re-order the levels of Trtmt_Day to go from resistance to recovery then rename them for nice plotting
div_effects_protegens$Trtmt_Day <- factor(div_effects_protegens$Trtmt_Day,
levels = c("recov_2", "recov_1", "resist"))
levels(div_effects_protegens$Trtmt_Day) <- c("Recovery (Day 2)", "Recovery (Day 1)", "Resistance")
# plot conditional part of the model
ggplot(div_effects_protegens,
aes(x = est, y = as.factor(Heat), colour = Trtmt_Day, shape = as.logical(protegens))) +
geom_vline(xintercept = 0, colour="darkgrey") +
geom_point(position = position_dodge(width = 0.5)) +
geom_errorbarh(position = position_dodge(width = 0.5),
aes(xmin = loCI, xmax = hiCI), height = 0.1) +
scale_colour_manual(values=trtmt_pal) +
labs(title = "Effect size of conditional model",
x = "Effect Size on Shannon Diversity",
shape = "protegens\npresent?",
y="Heat duration")
## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_errorbarh()`).
# okay so we have confirmed that the presence of protegens doesn't actually interact with heat
# so let's average across the effect of protegens.
posthoc_6h <- emmeans(effect_6h, pairwise ~ Trtmt_Day*CommRich, data = diversity_6h)
## NOTE: Results may be misleading due to involvement in interactions
posthoc_12h <- emmeans(effect_12h, pairwise ~ Trtmt_Day*CommRich, data = diversity_12h)
## NOTE: Results may be misleading due to involvement in interactions
posthoc_24h <- emmeans(effect_24h, pairwise ~ Trtmt_Day*CommRich, data = diversity_24h)
## NOTE: Results may be misleading due to involvement in interactions
posthoc_48h <- emmeans(effect_48h, pairwise ~ Trtmt_Day*CommRich, data = diversity_48h)
## NOTE: Results may be misleading due to involvement in interactions
# a function that extracts the confidence intervals from a post-hoc object
get_posthoc <- function(posthoc_object, heat_trtmt) {
output <- multcomp::cld(posthoc_object, alpha=0.05/4, Letters = letters) %>%
data.frame() %>%
select(-df)
colnames(output)[3:7] <- c("est", "SE", "loCI", "hiCI", "groups")
output$Heat <- heat_trtmt
return(output)
}
div_effects <- data.frame()
div_effects <- rbind(div_effects,
get_posthoc(posthoc_6h, heat_trtmt = 6),
get_posthoc(posthoc_12h, heat_trtmt = 12),
get_posthoc(posthoc_24h, heat_trtmt = 24),
get_posthoc(posthoc_48h, heat_trtmt = 48),
                     c("resist", rep(NA, 6), 48))
# several columns were coerced to character. They need to be put back as numeric values
div_effects[,c(-1, -7)] <- sapply(div_effects[,c(-1, -7)], as.numeric)
# re-order the levels of Trtmt_Day to go from resistance to recovery then rename them for nice plotting
div_effects$Trtmt_Day <- factor(div_effects$Trtmt_Day,
levels = c("recov_2", "recov_1", "resist"))
levels(div_effects$Trtmt_Day) <- c("Recovery (Day 2)", "Recovery (Day 1)", "Resistance")
# plot with group labels
ggplot(div_effects,
aes(x = est, y = as.factor(Heat), colour = Trtmt_Day)) +
geom_vline(xintercept = 0, colour="darkgrey") +
geom_point(position = position_dodge(width = 0.5)) +
geom_errorbarh(position = position_dodge(width = 0.5),
aes(xmin = loCI, xmax = hiCI), height = 0.1) +
geom_text(position = position_dodge(width = 0.5),
aes(x=-1.3, label=groups)) +
scale_colour_manual(values=trtmt_pal) +
labs(x = "Effect Size on Shannon Diversity",
y="Heat duration")
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_errorbarh()`).
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_text()`).
# plot without group labels
ggplot(div_effects,
aes(x = est, y = as.factor(Heat), colour = Trtmt_Day)) +
geom_vline(xintercept = 0, colour="darkgrey") +
geom_point(position = position_dodge(width = 0.5)) +
geom_errorbarh(position = position_dodge(width = 0.5),
aes(xmin = loCI, xmax = hiCI), height = 0.1) +
scale_colour_manual(values=trtmt_pal) +
labs(x = "Effect Size on Shannon Diversity",
y="Heat duration")
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_errorbarh()`).
#######
# finally, we will do a series of pairwise two-tailed t-tests to compare between heat durations
#######
# a function that approximates the sample size from each data subset
estimate_n <- function(data_subset, CommRich = FALSE) {
  # get the number of unique IDs present on recovery day 2 for the heat treatment,
  # optionally restricted to a single specified CommRich level,
  # then divide by 4 as we want to know the average sample size across CommRich
  keep <- data_subset$Heat != "control" & data_subset$Trtmt_Day == "recov_2"
  if (CommRich %in% 1:4) keep <- keep & data_subset$CommRich == CommRich
  length(unique(data_subset[keep, ]$uniqID)) / 4
}
# a function that runs two-tailed t-test between row numbers of diversity_effects df
run_ttest <- function(row_x, row_y,
summary_stats_df){
ttest_object <- tsum.test(mean.x = summary_stats_df$est[row_x],
s.x = summary_stats_df$SE[row_x],
n.x = summary_stats_df$n[row_x],
mean.y = summary_stats_df$est[row_y],
s.y = summary_stats_df$SE[row_y],
n.y = summary_stats_df$n[row_y],
alternative="two.sided")
return(data.frame(t_statistic = ttest_object$statistic,
df = ttest_object$parameters,
pvalue = ttest_object$p.value))
}
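For reference, `tsum.test()` (from the BSDA package) runs Welch's two-sample t-test from summary statistics alone. A minimal base-R sketch of that computation, using made-up summary values (not this experiment's data):

```r
# Welch's two-sample t-test from summary statistics -- a base-R sketch of what
# BSDA::tsum.test(alternative = "two.sided") computes
welch_from_summary <- function(mean.x, s.x, n.x, mean.y, s.y, n.y) {
  vx <- s.x^2 / n.x   # squared standard error, sample x
  vy <- s.y^2 / n.y   # squared standard error, sample y
  t_stat <- (mean.x - mean.y) / sqrt(vx + vy)
  # Welch-Satterthwaite approximation for the degrees of freedom
  df <- (vx + vy)^2 / (vx^2 / (n.x - 1) + vy^2 / (n.y - 1))
  p <- 2 * pt(-abs(t_stat), df)   # two-sided p-value
  data.frame(t_statistic = t_stat, df = df, pvalue = p)
}

# hypothetical summary stats, for illustration only
welch_from_summary(mean.x = 10, s.x = 2, n.x = 16,
                   mean.y = 12, s.y = 3, n.y = 16)
```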
# estimate the sample sizes
temp <- div_effects # copy the effects to temp
div_effects <- rbind(temp %>% filter(Heat == 6) %>% mutate(n = estimate_n(diversity_6h)),
temp %>% filter(Heat == 12) %>% mutate(n = estimate_n(diversity_12h)),
temp %>% filter(Heat == 24) %>% mutate(n = estimate_n(diversity_24h)),
temp %>% filter(Heat == 48) %>% mutate(n = estimate_n(diversity_48h)))
rm(temp)
# estimate the SD from the SE
div_effects <- div_effects %>% mutate(SD = SE * sqrt(n)) %>%
# re-order by Heat and Trtmt_Day
arrange(Heat, Trtmt_Day)
# all pairwise combinations of comparisons between the same treatment day for different durations
temp <- t(combn(c(1,4,7,10), 2))
combos <- rbind(temp, temp+1, temp+2)
rm(temp)
# loop through all the combinations and do the t-tests
divEffects_ttests <- data.frame()
for(i in 1:nrow(combos)){
divEffects_ttests <- rbind(divEffects_ttests,
run_ttest(row_x = combos[i,1],
row_y = combos[i,2],
summary_stats_df = div_effects))
}
divEffects_ttests$adjusted_p <- p.adjust(divEffects_ttests$pvalue, method = "bonferroni")
divEffects_ttests$Trtmt_Day <- div_effects$Trtmt_Day[combos[,1]]
divEffects_ttests$Heat_1 <- div_effects$Heat[combos[,1]]
divEffects_ttests$Heat_2 <- div_effects$Heat[combos[,2]]
print(divEffects_ttests)
## t_statistic df pvalue adjusted_p Trtmt_Day Heat_1
## t -2.9367330 24.21319 7.168272e-03 1.075241e-01 Recovery (Day 2) 6
## t1 -3.5904673 22.88874 1.554447e-03 2.331670e-02 Recovery (Day 2) 6
## t2 -8.0844536 12.59581 2.456962e-06 3.685442e-05 Recovery (Day 2) 6
## t3 -0.7137804 22.33286 4.827566e-01 1.000000e+00 Recovery (Day 2) 12
## t4 -6.8230618 12.62259 1.419776e-05 2.129665e-04 Recovery (Day 2) 12
## t5 -6.4839374 12.79304 2.218106e-05 3.327158e-04 Recovery (Day 2) 24
## t6 -6.4278372 23.66885 1.283193e-06 1.924789e-05 Recovery (Day 1) 6
## t7 -10.4942024 23.23869 2.722640e-10 4.083960e-09 Recovery (Day 1) 6
## t8 -5.0732546 11.23554 3.358669e-04 5.038003e-03 Recovery (Day 1) 6
## t9 -3.2050257 22.10628 4.066451e-03 6.099676e-02 Recovery (Day 1) 12
## t10 -3.2705320 11.44590 7.097273e-03 1.064591e-01 Recovery (Day 1) 12
## t11 -2.4194681 11.15958 3.374741e-02 5.062111e-01 Recovery (Day 1) 24
## t12 -5.2051760 24.24834 2.399768e-05 3.599652e-04 Resistance 6
## t13 -18.3959780 22.59847 4.315210e-15 6.472815e-14 Resistance 6
## t14 NA NA NA NA Resistance 6
## t15 -13.0086462 21.91706 8.778384e-12 1.316758e-10 Resistance 12
## t16 NA NA NA NA Resistance 12
## t17 NA NA NA NA Resistance 24
## Heat_2
## t 12
## t1 24
## t2 48
## t3 24
## t4 48
## t5 48
## t6 12
## t7 24
## t8 48
## t9 24
## t10 48
## t11 48
## t12 12
## t13 24
## t14 48
## t15 24
## t16 48
## t17 48
# and we can also plot the zero inflated model effect sizes:
effect_6h_zi <- eff_size(emmeans(divers6h_H2, ~ Heat, data = diversity_6h, component = "zi"),
sigma(divers6h_H2),
edf = df.residual(divers6h_H2))
effect_12h_zi <- eff_size(emmeans(divers12h_H2, ~ Heat, data = diversity_12h, component = "zi"),
sigma(divers12h_H2),
edf = df.residual(divers12h_H2))
effect_24h_zi <- eff_size(emmeans(divers24h_H2, ~ Heat, data = diversity_24h, component = "zi"),
sigma(divers24h_H2),
edf = df.residual(divers24h_H2))
effect_48h_zi <- eff_size(emmeans(divers48h_H2_1, ~ Heat, data = diversity_48h, component = "zi"),
sigma(divers48h_H2_1),
edf = df.residual(divers48h_H2_1))
# a function to extract CIs from eff_size on ZERO INFLATED part *** averaging across protegens ***
get_ZI_effsize_CIs <- function(eff_size_object, heat_trtmt) {
data.frame(Heat = heat_trtmt,
ZIest = confint(eff_size_object)[[2]],
ZIloCI = confint(eff_size_object)[[5]],
ZIhiCI = confint(eff_size_object)[[6]])
}
# create a data.frame for plotting marginal effect sizes using a forest plot with the group labels
div_ZIeff <- data.frame()
div_ZIeff <- rbind(div_ZIeff,
get_ZI_effsize_CIs(effect_6h_zi, heat_trtmt = 6),
get_ZI_effsize_CIs(effect_12h_zi, heat_trtmt = 12),
get_ZI_effsize_CIs(effect_24h_zi, heat_trtmt = 24),
get_ZI_effsize_CIs(effect_48h_zi, heat_trtmt = 48))
ggplot(div_ZIeff,
aes(x = ZIest, y = as.factor(Heat))) +
geom_vline(xintercept = 0, colour="darkgrey") +
geom_point(position = position_dodge(width = 0.5)) +
geom_errorbarh(position = position_dodge(width = 0.5),
aes(xmin = ZIloCI, xmax = ZIhiCI), height = 0.1) +
scale_colour_manual(values=trtmt_pal) +
labs(title = "Effect size of zero-inflated model",
x = "Effect Size on Shannon Diversity",
y="Heat duration")
# a function to get conservative overall CI limits: take the min/max across all products of the conditional and (1 - ZI) interval endpoints
hack_CIs <- function(cond_loCI, cond_hiCI, ZI_loCI, ZI_hiCI){
one <- cond_loCI * (1 - ZI_loCI)
two <- cond_loCI * (1 - ZI_hiCI)
three <- cond_hiCI * (1 - ZI_loCI)
four <- cond_hiCI * (1 - ZI_hiCI)
array <- cbind(one, two, three, four)
return(list(apply(array, 1, min),
apply(array, 1, max)))
}
# we can also be super hacky and estimate an overall effect size that combines the conditional and zero-inflated parts of the model:
div_overall_effects <- inner_join(div_effects, div_ZIeff, by = "Heat") %>%
mutate(overall_est = est*(1 - ZIest),
overall_loCI = hack_CIs(loCI, hiCI, ZIloCI, ZIhiCI)[[1]],
overall_hiCI = hack_CIs(loCI, hiCI, ZIloCI, ZIhiCI)[[2]]) %>%
select(Trtmt_Day, Heat, overall_est, overall_loCI, overall_hiCI)
# plot
ggplot(div_overall_effects,
aes(x = overall_est, y = as.factor(Heat), colour = Trtmt_Day)) +
geom_vline(xintercept = 0, colour="darkgrey") +
geom_point(position = position_dodge(width = 0.5)) +
geom_errorbarh(position = position_dodge(width = 0.5),
aes(xmin = overall_loCI, xmax = overall_hiCI), height = 0.1) +
scale_colour_manual(values=trtmt_pal) +
labs(x = "Effect Size on Shannon Diversity",
y = "Heat duration",
title = "hacky combination of cond & ZI effects")
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).
## Removed 1 row containing missing values or values outside the scale range
## (`geom_errorbarh()`).
# cleanup
rm(divers6h_H0, divers6h_H1, divers6h_H2, emm_6h, effect_6h,
divers12h_H0, divers12h_H1, divers12h_H2, emm_12h, effect_12h,
divers24h_H0, divers24h_H1, divers24h_H2, emm_24h, effect_24h,
divers48h_H0, divers48h_H1, divers48h_H2, emm_48h, effect_48h,
divers6h_H2_1, divers12h_H2_1, divers24h_H2_1, divers48h_H2_1,
div_effects_protegens, divEffects_ttests,
effect_6h_zi, effect_12h_zi, effect_24h_zi, effect_48h_zi,
posthoc_6h, posthoc_12h, posthoc_24h, posthoc_48h,
get_ZI_effsize_CIs, div_ZIeff,
div_effects, div_overall_effects)
This is a bit confusing because the effect sizes are split across the conditional and the zero-inflated parts of the model. The overall effect size is (1 - zi) * (conditional mean). See this Stack Overflow thread.
Let’s try the analysis again, but now adding a small value to all diversity estimates so that we remove the zeros. This way we will no longer need a zero-inflated component in the model, and the effect sizes will be simpler to explain. (Especially because I think I am losing a lot of power for the 48h treatment, where the diversity values are literally all 0.)
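As a numeric sanity check on that combination: for a zero-inflated lognormal with zero-probability p and conditional parameters mu and sigma on the log scale, the overall expectation is (1 - p) * exp(mu + sigma^2 / 2). A minimal sketch with made-up parameter values (not fitted estimates):

```r
# overall mean of a zero-inflated lognormal = (1 - p_zero) * conditional mean
# (illustrative parameter values only)
p_zero <- 0.3    # zero-inflation probability
mu     <- 0.5    # lognormal meanlog
sigma  <- 1.0    # lognormal sdlog

cond_mean    <- exp(mu + sigma^2 / 2)    # E[Y | Y > 0]
overall_mean <- (1 - p_zero) * cond_mean

# check against a large simulation
set.seed(1)
y <- ifelse(runif(1e6) < p_zero, 0, rlnorm(1e6, meanlog = mu, sdlog = sigma))
c(analytic = overall_mean, simulated = mean(y))
```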
# find the smallest non-zero value in the rescaled diversity estimate
smallest_diversity <- min(diversity_forFit$Diversity_scale[diversity_forFit$Diversity_scale != 0], na.rm=TRUE)
# now add 1/100th of that value to all the diversity estimates and re-do all the analyses I did above...
####################
# 6h heat duration
####################
# add small value to diversity
diversity_6h$Diversity_scalePLUSepsilon <- diversity_6h$Diversity_scale + smallest_diversity/100
divers6h_H0 <- glmmTMB(Diversity_scalePLUSepsilon ~ CommRich*Heat*Trtmt_Day,
data = diversity_6h,
family = lognormal,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = divers6h_H0, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.728 0.82 0.78 0.768 0.128 0.624 0.852 0.516 0.096 0.896 0.592 0.448 0.816 0.476 0.532 0.492 0.652 0.888 0.128 0.684 ...
divers6h_H1 <- glmmTMB(Diversity_scalePLUSepsilon ~ CommRich*Heat*Trtmt_Day + putida,
data = diversity_6h,
family = lognormal,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = divers6h_H1, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.672 0.752 0.66 0.668 0.048 0.472 0.772 0.728 0.216 0.964 0.54 0.348 0.776 0.728 0.524 0.288 0.54 0.84 0.248 0.62 ...
divers6h_H2 <- glmmTMB(Diversity_scalePLUSepsilon ~ CommRich*Heat*Trtmt_Day + protegens,
data = diversity_6h,
family = lognormal,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = divers6h_H2, plot = TRUE)
## qu = 0.75, log(sigma) = -2.600131 : outer Newton did not converge fully.
## qu = 0.75, log(sigma) = -2.561174 : outer Newton did not converge fully.
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.036 0.664 0.292 0.996 0.372 0.068 0.704 0.884 0.352 0.88 0.952 0.744 0.172 0.8 0.804 0.9 0.132 0.804 0.36 0.968 ...
# let's try categorical CommRich
divers6h_H2_1 <- glmmTMB(Diversity_scalePLUSepsilon ~ as.factor(CommRich)*Heat*Trtmt_Day + protegens,
data = diversity_6h,
family = lognormal,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = divers6h_H2_1, plot = TRUE)
divers6h_H3 <- glmmTMB(Diversity_scalePLUSepsilon ~ Heat*Trtmt_Day*protegens + CommRich,
data = diversity_6h,
family = lognormal,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = divers6h_H3, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.108 0.72 0.404 0.984 0.252 0.12 0.772 0.84 0.216 0.912 0.868 0.68 0.288 0.736 0.7 0.844 0.18 0.876 0.244 0.94 ...
# check preferred models
anova(divers6h_H0, divers6h_H1)
anova(divers6h_H0, divers6h_H2)
AIC(divers6h_H0, divers6h_H1, divers6h_H2, divers6h_H2_1, divers6h_H3) %>% arrange(AIC)
BIC(divers6h_H0, divers6h_H1, divers6h_H2, divers6h_H2_1, divers6h_H3) %>% arrange(BIC)
# H3 is the preferred model
summary(divers6h_H3)
## Family: lognormal ( log )
## Formula:
## Diversity_scalePLUSepsilon ~ Heat * Trtmt_Day * protegens + CommRich
## Data: diversity_6h
##
## AIC BIC logLik deviance df.resid
## -1182.1 -1130.4 605.1 -1210.1 282
##
##
## Dispersion parameter for lognormal family (): 1.18
##
## Conditional model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -2.00048 0.25042 -7.988 1.37e-15 ***
## Heatcontrol 1.59789 0.16921 9.443 < 2e-16 ***
## Trtmt_Dayrecov_2 -0.22635 0.20903 -1.083 0.27887
## Trtmt_Dayresist 0.03897 0.19819 0.197 0.84410
## protegens -3.08242 0.25613 -12.034 < 2e-16 ***
## CommRich 0.63449 0.06871 9.234 < 2e-16 ***
## Heatcontrol:Trtmt_Dayrecov_2 -0.32853 0.24958 -1.316 0.18807
## Heatcontrol:Trtmt_Dayresist 0.23018 0.22159 1.039 0.29892
## Heatcontrol:protegens -1.02576 0.34761 -2.951 0.00317 **
## Trtmt_Dayrecov_2:protegens 0.17485 0.36999 0.473 0.63651
## Trtmt_Dayresist:protegens 0.16876 0.35912 0.470 0.63841
## Heatcontrol:Trtmt_Dayrecov_2:protegens 0.33225 0.50143 0.663 0.50758
## Heatcontrol:Trtmt_Dayresist:protegens -0.09462 0.47585 -0.199 0.84239
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# create data.frame for plotting
divers_predict <- cbind(divers6h_H3$frame,
predict(divers6h_H3, type="response"))
colnames(divers_predict)[c(1,6)] <- c("observed", "predicted")
# plot the model predictions against the data
ggplot(divers_predict,
aes(x=Trtmt_Day, y=observed, colour=as.factor(CommRich))) +
facet_grid(protegens~Heat) +
geom_jitter(alpha=0.4) +
geom_line(aes(y=predicted, group=as.factor(CommRich))) +
scale_colour_viridis_d(option = "viridis", begin=0.1, end=0.85) +
labs(y="Shannon diversity (rescaled)",
colour="CommRich")
# cleanup
rm(divers_predict)
####################
# 12h heat duration
####################
# add small value to diversity
diversity_12h$Diversity_scalePLUSepsilon <- diversity_12h$Diversity_scale + smallest_diversity/100
divers12h_H0 <- glmmTMB(Diversity_scalePLUSepsilon ~ CommRich*Heat*Trtmt_Day,
data = diversity_12h,
family = lognormal,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = divers12h_H0, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.776 0.212 0.828 0.224 0.776 0.18 0.172 0.976 0.144 0.132 0.18 0.496 0.224 0.172 0.772 0.18 0.688 0.648 0.796 0.168 ...
divers12h_H1 <- glmmTMB(Diversity_scalePLUSepsilon ~ CommRich*Heat*Trtmt_Day + putida,
data = diversity_12h,
family = lognormal,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = divers12h_H1, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.708 0.116 0.756 0.132 0.684 0.292 0.3 0.988 0.076 0.1 0.36 0.48 0.12 0.124 0.724 0.304 0.648 0.6 0.768 0.3 ...
divers12h_H2 <- glmmTMB(Diversity_scalePLUSepsilon ~ CommRich*Heat*Trtmt_Day + protegens,
data = diversity_12h,
family = lognormal,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = divers12h_H2, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.532 0.004 0.988 0.516 0.5 0.464 0.432 0.98 0.18 0.188 0.252 0.488 0.492 0.008 0.608 0.452 0.88 0.856 0.3 0.208 ...
divers12h_H2_1 <- glmmTMB(Diversity_scalePLUSepsilon ~ as.factor(CommRich)*Heat*Trtmt_Day + protegens,
data = diversity_12h,
family = lognormal,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
divers12h_H3 <- glmmTMB(Diversity_scalePLUSepsilon ~ Heat*Trtmt_Day*protegens + CommRich,
data = diversity_12h,
family = lognormal,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = divers12h_H3, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.724 0.016 0.984 0.364 0.684 0.316 0.312 0.992 0.116 0.116 0.148 0.468 0.34 0.012 0.764 0.32 0.868 0.844 0.56 0.152 ...
# check preferred models
anova(divers12h_H0, divers12h_H1)
anova(divers12h_H0, divers12h_H2)
anova(divers12h_H2, divers12h_H2_1)
AIC(divers12h_H0, divers12h_H1, divers12h_H2, divers12h_H2_1, divers12h_H3) %>% arrange(AIC)
BIC(divers12h_H0, divers12h_H1, divers12h_H2, divers12h_H2_1, divers12h_H3) %>% arrange(BIC)
# H3 is the preferred model (even if its residuals are the worst)
summary(divers12h_H3)
## Family: lognormal ( log )
## Formula:
## Diversity_scalePLUSepsilon ~ Heat * Trtmt_Day * protegens + CommRich
## Data: diversity_12h
##
## AIC BIC logLik deviance df.resid
## -1427.6 -1376.6 727.8 -1455.6 268
##
##
## Dispersion parameter for lognormal family (): 1.98
##
## Conditional model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -4.19483 0.41565 -10.092 < 2e-16 ***
## Heatcontrol 3.01177 0.31709 9.498 < 2e-16 ***
## Trtmt_Dayrecov_2 0.44382 0.39899 1.112 0.265987
## Trtmt_Dayresist 0.78636 0.38663 2.034 0.041960 *
## protegens -1.30322 0.37042 -3.518 0.000434 ***
## CommRich 0.82326 0.09050 9.096 < 2e-16 ***
## Heatcontrol:Trtmt_Dayrecov_2 -0.51243 0.43566 -1.176 0.239504
## Heatcontrol:Trtmt_Dayresist -0.35215 0.41631 -0.846 0.397621
## Heatcontrol:protegens -2.28494 0.44927 -5.086 3.66e-07 ***
## Trtmt_Dayrecov_2:protegens -0.07363 0.50735 -0.145 0.884607
## Trtmt_Dayresist:protegens -0.67251 0.49943 -1.347 0.178122
## Heatcontrol:Trtmt_Dayrecov_2:protegens -0.48255 0.64060 -0.753 0.451282
## Heatcontrol:Trtmt_Dayresist:protegens 0.29600 0.61185 0.484 0.628545
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# create data.frame for plotting
divers_predict <- cbind(divers12h_H3$frame,
predict(divers12h_H3, type="response"))
colnames(divers_predict)[c(1,6)] <- c("observed", "predicted")
# plot the model predictions against the data
ggplot(divers_predict,
aes(x=Trtmt_Day, y=observed, colour=as.factor(CommRich))) +
facet_grid(protegens~Heat) +
geom_jitter(alpha=0.4) +
geom_line(aes(y=predicted, group=as.factor(CommRich))) +
scale_colour_viridis_d(option = "viridis", begin=0.1, end=0.85) +
labs(y="Shannon diversity (rescaled)",
colour="CommRich")
# cleanup
rm(divers_predict)
####################
# 24h heat duration
####################
# add small value to diversity
diversity_24h$Diversity_scalePLUSepsilon <- diversity_24h$Diversity_scale + smallest_diversity/100
divers24h_H0 <- glmmTMB(Diversity_scalePLUSepsilon ~ CommRich*Heat*Trtmt_Day,
data = diversity_24h,
family = lognormal,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = divers24h_H0, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.9 0.792 0.076 0.528 0.812 0.904 0.116 0.06 0.86 0.652 0.1 0.896 0.096 0.076 0.672 0.788 0.864 0.096 0.832 0.716 ...
divers24h_H1 <- glmmTMB(Diversity_scalePLUSepsilon ~ CommRich*Heat*Trtmt_Day + putida,
data = diversity_24h,
family = lognormal,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = divers24h_H1, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.88 0.736 0.028 0.384 0.716 0.84 0.18 0.104 0.904 0.596 0.044 0.864 0.176 0.044 0.556 0.72 0.788 0.132 0.796 0.648 ...
divers24h_H2 <- glmmTMB(Diversity_scalePLUSepsilon ~ CommRich*Heat*Trtmt_Day + protegens,
data = diversity_24h,
family = lognormal,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = divers24h_H2, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.744 0.508 0.148 0.74 0.42 0.672 0.188 0.124 0.568 0.796 0.092 0.464 0.096 0.028 0.876 0.444 0.624 0.14 0.936 0.832 ...
divers24h_H2_1 <- glmmTMB(Diversity_scalePLUSepsilon ~ as.factor(CommRich)*Heat*Trtmt_Day + protegens,
data = diversity_24h,
family = lognormal,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
divers24h_H3 <- glmmTMB(Diversity_scalePLUSepsilon ~ Heat*Trtmt_Day*protegens + CommRich,
data = diversity_24h,
family = lognormal,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = divers24h_H3, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.772 0.212 0.252 0.888 0.168 0.692 0.304 0.264 0.392 0.9 0.124 0.116 0.112 0 0.98 0.172 0.652 0.224 0.984 0.968 ...
# check preferred models
anova(divers24h_H0, divers24h_H1)
anova(divers24h_H0, divers24h_H2)
anova(divers24h_H0, divers24h_H3)
anova(divers24h_H2, divers24h_H2_1)
AIC(divers24h_H0, divers24h_H1, divers24h_H2, divers24h_H2_1, divers24h_H3) %>% arrange(AIC)
BIC(divers24h_H0, divers24h_H1, divers24h_H2, divers24h_H2_1, divers24h_H3) %>% arrange(BIC)
# H3 is the preferred model
summary(divers24h_H3)
## Family: lognormal ( log )
## Formula:
## Diversity_scalePLUSepsilon ~ Heat * Trtmt_Day * protegens + CommRich
## Data: diversity_24h
##
## AIC BIC logLik deviance df.resid
## -983.0 -932.6 505.5 -1011.0 256
##
##
## Dispersion parameter for lognormal family (): 1.42
##
## Conditional model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.772211 0.332908 -11.331 < 2e-16
## Heatcontrol 2.367052 0.272670 8.681 < 2e-16
## Trtmt_Dayrecov_2 -1.171502 0.389847 -3.005 0.00266
## Trtmt_Dayresist 2.176447 0.273740 7.951 1.85e-15
## protegens 0.001435 0.298806 0.005 0.99617
## CommRich 0.856841 0.078291 10.944 < 2e-16
## Heatcontrol:Trtmt_Dayrecov_2 1.085193 0.422527 2.568 0.01022
## Heatcontrol:Trtmt_Dayresist -1.706848 0.303351 -5.627 1.84e-08
## Heatcontrol:protegens -3.637477 0.384456 -9.461 < 2e-16
## Trtmt_Dayrecov_2:protegens 0.189455 0.471746 0.402 0.68798
## Trtmt_Dayresist:protegens -3.905377 0.392261 -9.956 < 2e-16
## Heatcontrol:Trtmt_Dayrecov_2:protegens -0.732573 0.604728 -1.211 0.22574
## Heatcontrol:Trtmt_Dayresist:protegens 3.496338 0.514784 6.792 1.11e-11
##
## (Intercept) ***
## Heatcontrol ***
## Trtmt_Dayrecov_2 **
## Trtmt_Dayresist ***
## protegens
## CommRich ***
## Heatcontrol:Trtmt_Dayrecov_2 *
## Heatcontrol:Trtmt_Dayresist ***
## Heatcontrol:protegens ***
## Trtmt_Dayrecov_2:protegens
## Trtmt_Dayresist:protegens ***
## Heatcontrol:Trtmt_Dayrecov_2:protegens
## Heatcontrol:Trtmt_Dayresist:protegens ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# create data.frame for plotting
divers_predict <- cbind(divers24h_H3$frame,
predict(divers24h_H3, type="response"))
colnames(divers_predict)[c(1,6)] <- c("observed", "predicted")
# plot the model predictions against the data
ggplot(divers_predict,
aes(x=Trtmt_Day, y=observed, colour=as.factor(CommRich))) +
facet_grid(protegens~Heat) +
geom_jitter(alpha=0.4) +
geom_line(aes(y=predicted, group=as.factor(CommRich))) +
scale_colour_viridis_d(option = "viridis", begin=0.1, end=0.85) +
labs(y="Shannon diversity (rescaled)",
colour="CommRich")
# cleanup
rm(divers_predict)
####################
# 48h heat duration
####################
# let's see if we can now look at the resistance too?
diversity_48h <- rbind(diversity_forFit %>% filter(Heat == "48", Day > 2),
diversity_forFit %>% filter(Heat == "control", Day > 2))
# create a column for last day of heat, first day of recovery, and last day of recovery
diversity_48h$Trtmt_Day <- "resist"
diversity_48h$Trtmt_Day[diversity_48h$Day == 4] <- "recov_1"
diversity_48h$Trtmt_Day[diversity_48h$Day == 5] <- "recov_2"
# add small value to diversity
diversity_48h$Diversity_scalePLUSepsilon <- diversity_48h$Diversity_scale + smallest_diversity/100
divers48h_H0 <- glmmTMB(Diversity_scalePLUSepsilon ~ CommRich*Heat*Trtmt_Day,
data = diversity_48h,
family = lognormal,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = divers48h_H0, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.48 0.416 0.372 0.476 0.456 0.416 0.452 0.476 0.452 0.464 0.484 0.556 0.404 0.392 0.424 0.304 0.296 0.34 0.328 0.84 ...
# note: unlike the 6h/12h/24h H1 fits, this formula drops Trtmt_Day
divers48h_H1 <- glmmTMB(Diversity_scalePLUSepsilon ~ CommRich*Heat + putida,
data = diversity_48h,
family = lognormal,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = divers48h_H1, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.372 0.312 0.484 0.592 0.288 0.244 0.2 0.572 0.348 0.38 0.368 0.46 0.312 0.496 0.528 0.244 0.256 0.284 0.48 0.844 ...
divers48h_H2 <- glmmTMB(Diversity_scalePLUSepsilon ~ CommRich*Heat*Trtmt_Day + protegens,
data = diversity_48h,
family = lognormal,
control = glmmTMBControl(optCtrl = list(iter.max = 1e6, eval.max = 1e6)))
simulateResiduals(fittedModel = divers48h_H2, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.504 0.468 0.408 0.52 0.68 0.648 0.764 0.496 0.208 0.232 0.84 0.88 0.224 0.8 0.804 0.616 0.624 0.032 0.584 0.952 ...
# note: additive-only formula here (no interactions), unlike the other H2_1 fits
divers48h_H2_1 <- glmmTMB(Diversity_scalePLUSepsilon ~ protegens + Heat + CommRich,
data = diversity_48h,
family = lognormal,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
divers48h_H3 <- glmmTMB(Diversity_scalePLUSepsilon ~ Heat*Trtmt_Day*protegens + CommRich,
data = diversity_48h,
family = lognormal,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = divers48h_H3, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.556 0.52 0.476 0.584 0.384 0.316 0.172 0.56 0.472 0.476 0.476 0.552 0.444 0.384 0.42 0.232 0.248 0.316 0.272 0.9 ...
# check preferred models
anova(divers48h_H0, divers48h_H1)
anova(divers48h_H0, divers48h_H2)
anova(divers48h_H2, divers48h_H2_1)
anova(divers48h_H0, divers48h_H3)
AIC(divers48h_H0, divers48h_H1, divers48h_H2, divers48h_H2_1, divers48h_H3) %>% arrange(AIC)
BIC(divers48h_H0, divers48h_H1, divers48h_H2, divers48h_H2_1, divers48h_H3) %>% arrange(BIC)
# create data.frame for plotting
divers_predict <- cbind(divers48h_H3$frame,
predict(divers48h_H3, type="response"))
colnames(divers_predict)[c(1,6)] <- c("observed", "predicted")
# plot the model predictions against the data
ggplot(divers_predict,
aes(x=Trtmt_Day, y=observed, colour=as.factor(CommRich))) +
facet_grid(protegens~Heat) +
geom_jitter(alpha=0.4) +
geom_line(aes(y=predicted, group=as.factor(CommRich))) +
scale_colour_viridis_d(option = "viridis", begin=0.1, end=0.85) +
labs(y="Shannon diversity (rescaled)",
colour="CommRich")
# cleanup
rm(divers_predict)
############################
# effect sizes
############################
## remember to correct for the multiple comparisons created by subsetting the data into 4 heat-duration groups
## THEREFORE use alpha/n = 0.05/4 = 0.0125 as the threshold for significance
emm_6h <- emmeans(divers6h_H3, ~ Heat | CommRich + Trtmt_Day*protegens, data = diversity_6h, type = "response")
effect_6h <- eff_size(emm_6h, sigma(divers6h_H3), edf = df.residual(divers6h_H3))
emm_12h <- emmeans(divers12h_H3, ~ Heat | CommRich + Trtmt_Day*protegens, data = diversity_12h, type = "response")
effect_12h <- eff_size(emm_12h, sigma(divers12h_H3), edf = df.residual(divers12h_H3))
emm_24h <- emmeans(divers24h_H3, ~ Heat | CommRich + Trtmt_Day*protegens, data = diversity_24h, type = "response")
effect_24h <- eff_size(emm_24h, sigma(divers24h_H3), edf = df.residual(divers24h_H3))
emm_48h <- emmeans(divers48h_H3, ~ Heat | CommRich + Trtmt_Day*protegens, data = diversity_48h, type = "response")
effect_48h <- eff_size(emm_48h, sigma(divers48h_H3), edf = df.residual(divers48h_H3))
# create a data.frame for plotting marginal effect sizes using a forest plot with the group labels
div_effects_protegens <- data.frame()
div_effects_protegens <- rbind(div_effects_protegens,
get_effsize_CIs(effect_6h, heat_trtmt = 6),
get_effsize_CIs(effect_12h, heat_trtmt = 12),
get_effsize_CIs(effect_24h, heat_trtmt = 24),
get_effsize_CIs(effect_48h, heat_trtmt = 48))
# re-order the levels of Trtmt_Day to go from resistance to recovery then rename them for nice plotting
div_effects_protegens$Trtmt_Day <- factor(div_effects_protegens$Trtmt_Day,
levels = c("recov_2", "recov_1", "resist"))
levels(div_effects_protegens$Trtmt_Day) <- c("Recovery (Day 2)", "Recovery (Day 1)", "Resistance")
# plot conditional part of the model
ggplot(div_effects_protegens,
aes(x = est, y = as.factor(Heat), colour = Trtmt_Day, shape = as.logical(protegens))) +
geom_vline(xintercept = 0, colour="darkgrey") +
geom_point(position = position_dodge(width = 0.5)) +
geom_errorbarh(position = position_dodge(width = 0.5),
aes(xmin = loCI, xmax = hiCI), height = 0.1) +
scale_colour_manual(values=trtmt_pal) +
labs(title = "(diversity + x)",
x = "Effect Size on Shannon Diversity",
shape = "protegens\npresent?",
y="Heat duration")
# oh wow, this is why the 3rd model is highly preferred:
# there's a strong interaction between protegens & heat
# we can do a posthoc on this to illustrate statistically significant effects
posthoc_6h <- emmeans(effect_6h, pairwise ~ Trtmt_Day*protegens, data = diversity_6h)
## NOTE: Results may be misleading due to involvement in interactions
posthoc_12h <- emmeans(effect_12h, pairwise ~ Trtmt_Day*protegens, data = diversity_12h)
## NOTE: Results may be misleading due to involvement in interactions
posthoc_24h <- emmeans(effect_24h, pairwise ~ Trtmt_Day*protegens, data = diversity_24h)
## NOTE: Results may be misleading due to involvement in interactions
posthoc_48h <- emmeans(effect_48h, pairwise ~ Trtmt_Day*protegens, data = diversity_48h)
## NOTE: Results may be misleading due to involvement in interactions
# create a data.frame for plotting
div_effects <- data.frame()
div_effects <- rbind(div_effects,
get_posthoc(posthoc_6h, heat_trtmt = 6),
get_posthoc(posthoc_12h, heat_trtmt = 12),
get_posthoc(posthoc_24h, heat_trtmt = 24),
get_posthoc(posthoc_48h, heat_trtmt = 48))
# re-order the levels of Trtmt_Day to go from resistance to recovery then rename them for nice plotting
div_effects$Trtmt_Day <- factor(div_effects$Trtmt_Day,
levels = c("recov_2", "recov_1", "resist"))
levels(div_effects$Trtmt_Day) <- c("Recovery (Day 2)", "Recovery (Day 1)", "Resistance")
# plot with group labels
ggplot(div_effects,
aes(x = est, y = as.factor(Heat), colour = Trtmt_Day, shape=as.logical(protegens))) +
facet_grid(~protegens) +
geom_vline(xintercept = 0, colour="darkgrey") +
geom_point(position = position_dodge(width = 0.5)) +
geom_errorbarh(position = position_dodge(width = 0.5),
aes(xmin = loCI, xmax = hiCI), height = 0.1) +
geom_text(position = position_dodge(width = 0.5),
aes(x=-2.5, label=groups)) +
scale_colour_manual(values=trtmt_pal) +
labs(x = "Effect Size on Shannon Diversity",
y="Heat duration",
title = "(diversity + x)")
# anyway we can still average over the effect of protegens
# we can do a posthoc on this to illustrate statistically significant effects
posthoc_6h <- emmeans(effect_6h, pairwise ~ Trtmt_Day, data = diversity_6h)
## NOTE: Results may be misleading due to involvement in interactions
posthoc_12h <- emmeans(effect_12h, pairwise ~ Trtmt_Day, data = diversity_12h)
## NOTE: Results may be misleading due to involvement in interactions
posthoc_24h <- emmeans(effect_24h, pairwise ~ Trtmt_Day, data = diversity_24h)
## NOTE: Results may be misleading due to involvement in interactions
posthoc_48h <- emmeans(effect_48h, pairwise ~ Trtmt_Day, data = diversity_48h)
## NOTE: Results may be misleading due to involvement in interactions
# a function that extracts the confidence intervals from a post-hoc object
get_posthocTEMP <- function(posthoc_object, heat_trtmt) {
output <- multcomp::cld(posthoc_object, alpha=0.05/4, Letters = letters) %>%
data.frame() %>%
select(-df)
colnames(output)[2:6] <- c("est", "SE", "loCI", "hiCI", "groups")
output$Heat <- heat_trtmt
return(output)
}
# create a data.frame for plotting
div_effects <- data.frame()
div_effects <- rbind(div_effects,
get_posthocTEMP(posthoc_6h, heat_trtmt = 6),
get_posthocTEMP(posthoc_12h, heat_trtmt = 12),
get_posthocTEMP(posthoc_24h, heat_trtmt = 24),
get_posthocTEMP(posthoc_48h, heat_trtmt = 48))
# re-order the levels of Trtmt_Day to go from resistance to recovery then rename them for nice plotting
div_effects$Trtmt_Day <- factor(div_effects$Trtmt_Day,
levels = c("recov_2", "recov_1", "resist"))
levels(div_effects$Trtmt_Day) <- c("Recovery (Day 2)", "Recovery (Day 1)", "Resistance")
# plot with group labels
ggplot(div_effects,
aes(x = est, y = as.factor(Heat), colour = Trtmt_Day)) +
geom_vline(xintercept = 0, colour="darkgrey") +
geom_point(position = position_dodge(width = 0.5)) +
geom_errorbarh(position = position_dodge(width = 0.5),
aes(xmin = loCI, xmax = hiCI), height = 0.1) +
geom_text(position = position_dodge(width = 0.5),
aes(x=-2.5, label=groups)) +
scale_colour_manual(values=trtmt_pal) +
labs(x = "Effect Size on Shannon Diversity",
y="Heat duration",
title = "averaged over protegens (diversity + x)")
# cleanup
rm(divers6h_H0, divers6h_H1, divers6h_H2, emm_6h, effect_6h,
divers12h_H0, divers12h_H1, divers12h_H2, emm_12h, effect_12h,
divers24h_H0, divers24h_H1, divers24h_H2, emm_24h, effect_24h,
divers48h_H0, divers48h_H1, divers48h_H2, emm_48h, effect_48h,
divers6h_H2_1, divers12h_H2_1, divers24h_H2_1, divers48h_H2_1,
diversity_6h, diversity_12h, diversity_24h, diversity_48h,
div_effects_protegens, divEffects_ttests,
effect_6h_zi, effect_12h_zi, effect_24h_zi, effect_48h_zi,
posthoc_6h, posthoc_12h, posthoc_24h, posthoc_48h,
get_ZI_effsize_CIs, div_ZIeff,
div_effects, div_overall_effects, diversity_forFit)
## Warning in rm(divers6h_H0, divers6h_H1, divers6h_H2, emm_6h, effect_6h, :
## object 'divEffects_ttests' not found
## Warning in rm(divers6h_H0, divers6h_H1, divers6h_H2, emm_6h, effect_6h, :
## object 'effect_6h_zi' not found
## Warning in rm(divers6h_H0, divers6h_H1, divers6h_H2, emm_6h, effect_6h, :
## object 'effect_12h_zi' not found
## Warning in rm(divers6h_H0, divers6h_H1, divers6h_H2, emm_6h, effect_6h, :
## object 'effect_24h_zi' not found
## Warning in rm(divers6h_H0, divers6h_H1, divers6h_H2, emm_6h, effect_6h, :
## object 'effect_48h_zi' not found
## Warning in rm(divers6h_H0, divers6h_H1, divers6h_H2, emm_6h, effect_6h, :
## object 'get_ZI_effsize_CIs' not found
## Warning in rm(divers6h_H0, divers6h_H1, divers6h_H2, emm_6h, effect_6h, :
## object 'div_ZIeff' not found
## Warning in rm(divers6h_H0, divers6h_H1, divers6h_H2, emm_6h, effect_6h, :
## object 'div_overall_effects' not found
The Shannon diversity result above is puzzling: it's not clear to me why diversity decreases again at 48h. Let's try other measures of diversity, such as species richness or evenness. One complication is that the flow cytometry data carries some misclassification (possibly up to 10% of events), so raw richness is unreliable; we need richness-like estimates that weight species by their relative abundance and therefore effectively ignore rare (likely misclassified) species.
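To see why an abundance-weighted index should be safer than raw richness here, a minimal base-R sketch (toy relative abundances; the 1% "misclassified" species is hypothetical, and `hill_div` is a hand-rolled helper, not vegan/chemodiv code):

```r
# Hill diversity of order q: D_q = (sum_i p_i^q)^(1/(1-q)); the q = 1 limit is exp(H')
hill_div <- function(p, q) {
  p <- p[p > 0] / sum(p[p > 0])
  if (q == 1) exp(-sum(p * log(p))) else sum(p^q)^(1 / (1 - q))
}
true_comm <- c(0.50, 0.50, 0, 0)       # a genuine 2-species community
miscalled <- c(0.495, 0.495, 0.01, 0)  # same community + 1% misclassified events
hill_div(true_comm, 0); hill_div(miscalled, 0)  # q = 0 (richness): 2 vs 3, inflated by the artefact
hill_div(true_comm, 1); hill_div(miscalled, 1)  # q = 1: 2 vs ~2.10, barely moves
hill_div(true_comm, 2); hill_div(miscalled, 2)  # q = 2: 2 vs ~2.04, even less sensitive
```

So the larger q is, the more the index discounts rare species, which is exactly the robustness we want against a small misclassification rate.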
# Pielou's evenness: J = H'/log(S), with S from vegan's specnumber()
# but this value is likely to be unreliable, because the whole problem is that species number (i.e. richness) can't be estimated reliably from the flow cytometry data
test <- absDensity$Diversity/log(specnumber(absDensity[,c("Conc_putida", "Conc_protegens", "Conc_grimontii", "Conc_veronii")]))
summary(test)
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 0.0000 0.0092 0.0647 0.3112 0.6489 0.9995 796
# yeahhh this is not working properly...
# we can also try Simpson's diversity index??
summary(diversity(absDensity[,c("Conc_putida", "Conc_protegens", "Conc_grimontii", "Conc_veronii")], index="simpson")) ## this looks a lot like Shannon diversity?
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 0.00000 0.00000 0.00049 0.11324 0.07790 1.00000 72
summary(diversity(absDensity[,c("Conc_putida", "Conc_protegens", "Conc_grimontii", "Conc_veronii")], index="invsimpson")) ## gives many Inf values. This is not good!
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 1.000 1.000 1.000 Inf 1.084 Inf 72
# let's try having the relative abundances and using a different package
library(chemodiv)
test.df <- absDen_forFit %>% mutate(relden_putida = Conc_putida/Total_density,
relden_protegens = Conc_protegens/Total_density,
relden_grimontii = Conc_grimontii/Total_density,
relden_veronii = Conc_veronii/Total_density)
test.df <- test.df %>% mutate(HillEven_q0 = unlist(calcDiv(sampleData = test.df[,c("relden_putida","relden_protegens","relden_grimontii","relden_veronii")],
type = "HillEven",
q=0)),
HillEven_q1 = unlist(calcDiv(sampleData = test.df[,c("relden_putida","relden_protegens","relden_grimontii","relden_veronii")],
type = "HillEven",
q=1)),
HillEven_q2 = unlist(calcDiv(sampleData = test.df[,c("relden_putida","relden_protegens","relden_grimontii","relden_veronii")],
type = "HillEven",
q=2)),
HillDiv_q1 = unlist(calcDiv(sampleData = test.df[,c("relden_putida","relden_protegens","relden_grimontii","relden_veronii")],
type = "HillDiv",
q=1)),
HillDiv_q2 = unlist(calcDiv(sampleData = test.df[,c("relden_putida","relden_protegens","relden_grimontii","relden_veronii")],
type = "HillDiv",
q=2)))
ggplot(test.df %>% filter(CommRich > 1),
aes(y=HillEven_q0, x=Day, colour=as.factor(CommRich))) +
facet_grid(protegens~as.factor(Heat)) +
geom_point(alpha=0.5) +
scale_colour_viridis_d(option = "viridis", begin=0.1)
## Warning: Removed 28 rows containing missing values or values outside the scale range
## (`geom_point()`).
ggplot(test.df %>% filter(CommRich > 1),
aes(y=HillEven_q1, x=Day, colour=as.factor(CommRich))) +
facet_grid(protegens~as.factor(Heat)) +
geom_point(alpha=0.5) +
scale_colour_viridis_d(option = "viridis", begin=0.1)
## Warning: Removed 28 rows containing missing values or values outside the scale range
## (`geom_point()`).
ggplot(test.df %>% filter(CommRich > 1),
aes(y=HillEven_q2, x=Day, colour=as.factor(CommRich))) +
facet_grid(protegens~as.factor(Heat)) +
geom_point(alpha=0.5) +
scale_colour_viridis_d(option = "viridis", begin=0.1)
## Warning: Removed 28 rows containing missing values or values outside the scale range
## (`geom_point()`).
ggplot(test.df,
aes(y=HillDiv_q1, x=Day, colour=as.factor(CommRich))) +
facet_grid(protegens~as.factor(Heat)) +
geom_point(alpha=0.5) +
scale_colour_viridis_d(option = "viridis")
## Warning: Removed 56 rows containing missing values or values outside the scale range
## (`geom_point()`).
ggplot(test.df,
aes(y=HillDiv_q2, x=Day, colour=as.factor(CommRich))) +
facet_grid(protegens~as.factor(Heat)) +
geom_point(alpha=0.5) +
scale_colour_viridis_d(option = "viridis")
## Warning: Removed 56 rows containing missing values or values outside the scale range
## (`geom_point()`).
# this one gives infinite values. That's not useful.
# The Hill Diversity with q=1 seems useful!! Let's analyze this one:
absDen_forFit$HillDiv_q1 <- test.df$HillDiv_q1
## BUT NOTE THAT VALUES OF 1.0 can either indicate monocultures OR complete extinction
richness_forFit <- absDen_forFit %>% select(-Conc_putida, -Conc_protegens, -Conc_grimontii, -Conc_veronii)
# let's keep the monocultures but add back the extinction information, so that richness is set to 0.01 when a replicate has gone extinct
# (this way we don't have to deal with 0's)
richness_forFit$HillDiv_q1[which(richness_forFit$Total_density == 0)] <- 0.01
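The ambiguity flagged above is easy to reproduce with a hand-rolled q = 1 Hill helper (a sketch, not the chemodiv implementation): a true monoculture comes out at exactly 1, and a multi-species well where only one species survived does too, while an all-zero (extinct) well has no defined relative abundances at all, hence the explicit 0.01 recode keyed on Total_density:

```r
hill_q1 <- function(p) {              # exp of Shannon entropy; hypothetical helper
  p <- p[p > 0] / sum(p[p > 0])
  exp(-sum(p * log(p)))
}
hill_q1(c(1, 0, 0, 0))        # monoculture -> exactly 1
hill_q1(c(0.97, 0.03, 0, 0))  # near-total dominance -> just above 1
# c(0, 0, 0, 0) has no relative abundances to normalize, so extinction
# has to be encoded separately from the index itself
```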
# make heat into a factor with 0 as control
richness_forFit$Heat <- factor(richness_forFit$Heat,
levels = c("0", "6", "12", "24", "48"))
levels(richness_forFit$Heat)[1] <- "control"
# !!! emmeans expects the control to be the very *last* level !!!
richness_forFit$Heat <- factor(richness_forFit$Heat,
levels = c("6", "12", "24", "48", "control"))
# add a column indicating whether the replicate survived
richness_forFit <- inner_join(richness_forFit,
extinct.df %>% select(uniqID, Heat, survived),
by = c("uniqID", "Heat"))
# scale the data by its standard deviation
richness_forFit$HillDivq1_scale <- scale(richness_forFit$HillDiv_q1,
scale = sd(richness_forFit$HillDiv_q1, na.rm = TRUE),
center = FALSE)
summary(richness_forFit$HillDivq1_scale)
## V1
## Min. :0.03933
## 1st Qu.:3.93314
## Median :3.93314
## Mean :4.07449
## 3rd Qu.:3.98035
## Max. :9.15103
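A quick check on what `scale(center = FALSE, scale = sd(...))` does above: it only divides by the standard deviation, so zeros stay zero and the response stays non-negative (which the Gamma fit below needs), while the rescaled values end up with unit SD. Note the SD is passed explicitly because `scale(x, center = FALSE, scale = TRUE)` would divide by the root-mean-square instead. A minimal sketch with made-up numbers:

```r
x <- c(0, 1, 1, 2, 4)  # toy values standing in for HillDiv_q1
x_scaled <- as.numeric(scale(x, center = FALSE, scale = sd(x)))
x_scaled[1]                       # 0 is preserved
sd(x_scaled)                      # 1, by construction
all(order(x) == order(x_scaled))  # ordering unchanged; only the units differ
```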
try_gaussian <- glmmTMB(HillDivq1_scale ~ CommRich:Day + Heat:Day + CommRich:Heat:Day,
data = richness_forFit,
control = glmmTMBControl(optCtrl = list(iter.max = 10000,eval.max = 10000)))
simulateResiduals(fittedModel = try_gaussian, plot = TRUE)
## qu = 0.75, log(sigma) = -2.258662 : outer Newton did not converge fully.
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 1 0.284 0.316 0.264 0.332 0.268 1 1 0.328 0.32 0.94 0.32 0.284 1 0.304 0.3 0.32 0.328 0.332 0.32 ...
try_LOGgaussian <- glmmTMB(log(HillDivq1_scale) ~ CommRich:Day + Heat:Day + CommRich:Heat:Day,
data = richness_forFit,
control = glmmTMBControl(optCtrl = list(iter.max = 10000,eval.max = 10000)))
simulateResiduals(fittedModel = try_LOGgaussian, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.744 0.412 0.396 0.364 0.432 0.356 0.672 0.772 0.408 0.388 0.58 0.444 0.42 0.776 0.384 0.392 0.396 0.424 0.404 0.42 ...
try_gamma <- glmmTMB(HillDivq1_scale ~ CommRich:Day + Heat:Day + CommRich:Heat:Day,
data = richness_forFit,
family = ziGamma,
ziformula = ~1, # this needs to be added because there are 0 values in the data
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
## Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
## NA/NaN function evaluation
## (the same "NA/NaN function evaluation" warning was emitted many more times; repeats omitted)
simulateResiduals(fittedModel = try_gamma, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.936 0.464 0.472 0.488 0.4 0.46 0.82 0.916 0.48 0.4 0.784 0.432 0.508 0.984 0.464 0.464 0.528 0.456 0.464 0.456 ...
try_lognorm <- glmmTMB(HillDivq1_scale ~ CommRich:Day + Heat:Day + CommRich:Heat:Day,
data = richness_forFit,
family = lognormal,
ziformula = ~1, # this needs to be added because there are 0 values in the data
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = try_lognorm, plot = TRUE)
## qu = 0.75, log(sigma) = -3.001108 : outer Newton did not converge fully.
## qu = 0.75, log(sigma) = -3.138081 : outer Newton did not converge fully.
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.988 0.24 0.192 0.28 0.192 0.252 0.924 0.968 0.2 0.24 0.744 0.224 0.244 1 0.216 0.26 0.252 0.24 0.212 0.2 ...
try_LOGlognorm <- glmmTMB(log(HillDivq1_scale+1) ~ CommRich:Day + Heat:Day + CommRich:Heat:Day,
data = richness_forFit,
family = lognormal,
ziformula = ~1, # keeping zero inflation because the zero-inflated lognormal alone was already over-dispersed; this checks whether the log(x+1) transformation sufficiently pulls in the long tail
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = try_LOGlognorm, plot = TRUE)
## qu = 0.75, log(sigma) = -3.41335 : outer Newton did not converge fully.
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.804 0.264 0.212 0.292 0.212 0.276 0.668 0.804 0.232 0.272 0.54 0.248 0.26 0.86 0.236 0.288 0.284 0.268 0.236 0.212 ...
try_negbinom <- glmmTMB(as.integer(HillDivq1_scale*100) ~ CommRich:Day + Heat:Day + CommRich:Heat:Day,
data = richness_forFit,
family = nbinom2,
control = glmmTMBControl(optCtrl = list(iter.max = 10000,eval.max = 10000)))
simulateResiduals(fittedModel = try_negbinom, plot = TRUE)
## DHARMa:testOutliers with type = binomial may have inflated Type I error rates for integer-valued distributions. To get a more exact result, it is recommended to re-run testOutliers with type = 'bootstrap'. See ?testOutliers for details
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.916 0.456 0.416 0.425595 0.5083581 0.4 0.868 0.932 0.52 0.448 0.7867902 0.484 0.4699888 0.96 0.4434514 0.4073784 0.468141 0.444 0.4880899 0.484 ...
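As DHARMa's warning suggests, the binomial outlier test can be anticonservative for integer responses; a minimal sketch of the recommended bootstrap re-check, assuming the `try_negbinom` fit above is still in scope:

```r
# Re-run the outlier test with the bootstrap option DHARMa recommends
# for integer-valued responses (slower, but better Type I error control).
library(DHARMa)
res_nb <- simulateResiduals(fittedModel = try_negbinom)
testOutliers(res_nb, type = "bootstrap", nBoot = 100)
```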
try_negbinom0 <- glmmTMB(as.integer(HillDivq1_scale*100) ~ CommRich:Day + Heat:Day + CommRich:Heat:Day,
data = richness_forFit,
family = nbinom2,
ziformula = ~1, # try zero inflated distribution
control = glmmTMBControl(optCtrl = list(iter.max = 10000,eval.max = 10000)))
simulateResiduals(fittedModel = try_negbinom0, plot = TRUE)
## DHARMa:testOutliers with type = binomial may have inflated Type I error rates for integer-valued distributions. To get a more exact result, it is recommended to re-run testOutliers with type = 'bootstrap'. See ?testOutliers for details
## qu = 0.75, log(sigma) = -2.96785 : outer Newton did not converge fully.
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.936 0.4590895 0.464 0.4261037 0.516 0.456 0.884 0.952 0.424 0.468 0.796 0.428 0.4177094 0.992 0.412 0.376 0.46 0.488 0.4363978 0.46 ...
try_poisson <- glmmTMB(as.integer(HillDivq1_scale*100) ~ CommRich:Day + Heat:Day + CommRich:Heat:Day,
data = richness_forFit,
family = genpois,
#ziformula = ~1, # try zero inflated distribution
control = glmmTMBControl(optCtrl = list(iter.max = 10000,eval.max = 10000)))
simulateResiduals(fittedModel = try_poisson, plot = TRUE)
## DHARMa:testOutliers with type = binomial may have inflated Type I error rates for integer-valued distributions. To get a more exact result, it is recommended to re-run testOutliers with type = 'bootstrap'. See ?testOutliers for details
## qu = 0.75, log(sigma) = -2.861767 : outer Newton did not converge fully.
## qu = 0.75, log(sigma) = -3.554883 : outer Newton did not converge fully.
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.9 0.316 0.2361569 0.332 0.252 0.3002625 0.844 0.916 0.2733984 0.3 0.704 0.284 0.276 0.976 0.268 0.2741443 0.336 0.2873553 0.296 0.3587116 ...
try_poisson0 <- glmmTMB(as.integer(HillDivq1_scale*100) ~ CommRich:Day + Heat:Day + CommRich:Heat:Day,
data = richness_forFit,
family = genpois,
ziformula = ~1, # try zero inflated distribution
control = glmmTMBControl(optCtrl = list(iter.max = 10000,eval.max = 10000)))
simulateResiduals(fittedModel = try_poisson0, plot = TRUE)
## DHARMa:testOutliers with type = binomial may have inflated Type I error rates for integer-valued distributions. To get a more exact result, it is recommended to re-run testOutliers with type = 'bootstrap'. See ?testOutliers for details
## qu = 0.75, log(sigma) = -3.369021 : outer Newton did not converge fully.
## qu = 0.75, log(sigma) = -3.347646 : outer Newton did not converge fully.
## qu = 0.75, log(sigma) = -3.339481 : outer Newton did not converge fully.
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.884 0.252 0.288 0.224 0.2764622 0.256 0.856 0.896 0.292 0.316 0.676 0.3035417 0.2731302 0.976 0.276 0.2785247 0.308 0.292 0.3011322 0.284 ...
# clean up
rm(try_gaussian, try_LOGgaussian, try_gamma, try_lognorm, try_LOGlognorm, try_negbinom, try_negbinom0, try_poisson, try_poisson0)
# amusingly enough it seems that the good ol' Gaussian is our best choice.
# good thing that it's also the most robust to model violations *shrug*
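The one-model-per-family boilerplate above could also be collapsed into a loop over candidate families; a sketch, assuming `richness_forFit` is in scope and all fits converge:

```r
# Fit the same interaction structure under each zero-inflated candidate
# family and tabulate AIC (Gaussian fit separately, without zero inflation).
library(glmmTMB)
form <- HillDivq1_scale ~ CommRich:Day + Heat:Day + CommRich:Heat:Day
zi_fams <- list(gamma = ziGamma(link = "log"), lognorm = lognormal())
zi_fits <- lapply(zi_fams, function(fam)
  glmmTMB(form, data = richness_forFit, family = fam, ziformula = ~1))
c(gaussian = AIC(glmmTMB(form, data = richness_forFit)),
  sapply(zi_fits, AIC))
```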
####################
# 6h heat duration
####################
# grab just the treatment with its associated control data
richness_6h <- rbind(richness_forFit %>% filter(Heat == "6"),
richness_forFit %>% filter(Heat == "control", Day < 4))
# create a column for last day of heat, first day of recovery, and last day of recovery
richness_6h$Trtmt_Day <- "resist"
richness_6h$Trtmt_Day[richness_6h$Day == 2] <- "recov_1"
richness_6h$Trtmt_Day[richness_6h$Day == 3] <- "recov_2"
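An equivalent recode with `dplyr::case_when()` keeps the Day-to-phase mapping in a single expression (a sketch; same labels as above):

```r
library(dplyr)
richness_6h <- richness_6h %>%
  mutate(Trtmt_Day = case_when(Day == 2 ~ "recov_1",
                               Day == 3 ~ "recov_2",
                               TRUE     ~ "resist"))
```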
rich6h_H0 <- glmmTMB(HillDivq1_scale ~ CommRich*Heat*Trtmt_Day,
data = richness_6h,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = rich6h_H0, plot = TRUE)
## qu = 0.75, log(sigma) = -2.420495 : outer Newton did not converge fully.
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.508 0.604 0.48 0.524 0.396 0.4 0.496 0.468 0.512 0.484 0.412 0.432 0.46 0.468 0.576 0.408 0.388 0.804 0.488 0.392 ...
# add putida presence/absence as a fixed-effect predictor
rich6h_H1 <- glmmTMB(HillDivq1_scale ~ CommRich*Heat*Trtmt_Day + putida,
data = richness_6h,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = rich6h_H1, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.48 0.556 0.42 0.488 0.412 0.408 0.508 0.404 0.448 0.504 0.44 0.444 0.4 0.424 0.536 0.484 0.428 0.82 0.452 0.364 ...
rich6h_H2 <- glmmTMB(HillDivq1_scale ~ CommRich*Heat*Trtmt_Day + protegens,
data = richness_6h,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = rich6h_H2, plot = TRUE)
## qu = 0.75, log(sigma) = -2.644789 : outer Newton did not converge fully.
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.144 0.332 0.196 0.752 0.26 0.264 0.828 0.296 0.364 0.828 0.228 0.26 0.724 0.204 0.352 0.628 0.68 0.604 0.604 0.568 ...
# unfortunately this also opens up the possibility of interactions among the fixed effects
rich6h_H2_1 <- glmmTMB(HillDivq1_scale ~ CommRich*Heat*Trtmt_Day + protegens*CommRich,
data = richness_6h,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
rich6h_H3 <- glmmTMB(HillDivq1_scale ~ CommRich*Heat*Trtmt_Day*protegens,
data = richness_6h,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = rich6h_H3, plot = TRUE)
## qu = 0.75, log(sigma) = -2.826234 : outer Newton did not converge fully.
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.188 0.504 0.324 0.628 0.372 0.348 0.54 0.436 0.468 0.536 0.352 0.396 0.504 0.26 0.504 0.496 0.432 0.88 0.536 0.448 ...
# check preferred models
anova(rich6h_H0, rich6h_H1)
anova(rich6h_H0, rich6h_H2)
anova(rich6h_H2, rich6h_H2_1)
AIC(rich6h_H0, rich6h_H1, rich6h_H2, rich6h_H2_1, rich6h_H3) %>% arrange(AIC)
BIC(rich6h_H0, rich6h_H1, rich6h_H2, rich6h_H2_1, rich6h_H3) %>% arrange(BIC)
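The pairwise `anova()` calls and the separate AIC/BIC rankings can also be gathered into one comparison table; a sketch, assuming the five `rich6h_*` fits above:

```r
# Collect candidates in a named list and rank by AIC, keeping BIC alongside.
mods <- list(H0 = rich6h_H0, H1 = rich6h_H1, H2 = rich6h_H2,
             H2_1 = rich6h_H2_1, H3 = rich6h_H3)
cmp <- data.frame(df  = sapply(mods, function(m) attr(logLik(m), "df")),
                  AIC = sapply(mods, AIC),
                  BIC = sapply(mods, BIC))
cmp[order(cmp$AIC), ]
```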
# rich6h_H2_1 is the preferred model
summary(rich6h_H2_1)
## Family: gaussian ( identity )
## Formula:
## HillDivq1_scale ~ CommRich * Heat * Trtmt_Day + protegens * CommRich
## Data: richness_6h
##
## AIC BIC logLik deviance df.resid
## 838.1 897.7 -404.0 808.1 378
##
##
## Dispersion estimate for gaussian family (sigma^2): 0.458
##
## Conditional model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 3.23039 0.22702 14.230 < 2e-16 ***
## CommRich 0.80882 0.10714 7.549 4.37e-14 ***
## Heatcontrol -0.07924 0.32313 -0.245 0.806273
## Trtmt_Dayrecov_2 -0.07773 0.28899 -0.269 0.787961
## Trtmt_Dayresist -0.13791 0.28899 -0.477 0.633217
## protegens 0.71413 0.19749 3.616 0.000299 ***
## CommRich:Heatcontrol 0.20607 0.13755 1.498 0.134097
## CommRich:Trtmt_Dayrecov_2 0.01036 0.12493 0.083 0.933937
## CommRich:Trtmt_Dayresist 0.02154 0.12493 0.172 0.863088
## Heatcontrol:Trtmt_Dayrecov_2 0.02480 0.45543 0.054 0.956569
## Heatcontrol:Trtmt_Dayresist 0.34657 0.45543 0.761 0.446666
## CommRich:protegens -0.87966 0.09092 -9.675 < 2e-16 ***
## CommRich:Heatcontrol:Trtmt_Dayrecov_2 -0.08569 0.19410 -0.441 0.658874
## CommRich:Heatcontrol:Trtmt_Dayresist 0.01475 0.19410 0.076 0.939423
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# create data.frame for plotting
rich_predict <- cbind(rich6h_H2_1$frame,
predict(rich6h_H2_1, type="response"))
colnames(rich_predict)[c(1,6)] <- c("observed", "predicted")
# plot the model predictions against the data
ggplot(rich_predict,
aes(x=Trtmt_Day, y=observed, colour=as.factor(CommRich))) +
facet_grid(protegens~Heat) +
geom_jitter(alpha=0.4) +
geom_line(aes(y=predicted, group=as.factor(CommRich))) +
scale_colour_viridis_d(option = "viridis", begin=0.1, end=0.85) +
labs(y="Hill diversity (rescaled)",
colour="CommRich",
title = "rich6h_H2_1")
# cleanup
rm(rich_predict)
# create data.frame for plotting
rich_predict <- cbind(rich6h_H3$frame,
predict(rich6h_H3, type="response"))
colnames(rich_predict)[c(1,6)] <- c("observed", "predicted")
# plot the model predictions against the data
ggplot(rich_predict,
aes(x=Trtmt_Day, y=observed, colour=as.factor(CommRich))) +
facet_grid(protegens~Heat) +
geom_jitter(alpha=0.4) +
geom_line(aes(y=predicted, group=as.factor(CommRich))) +
scale_colour_viridis_d(option = "viridis", begin=0.1, end=0.85) +
labs(y="Hill diversity (rescaled)",
colour="CommRich",
title = "rich6h_H3")
# cleanup
rm(rich_predict)
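Since the observed-vs-predicted plot is rebuilt verbatim for each candidate model, a small helper would remove the copy/paste. A sketch, assuming glmmTMB fits whose first `$frame` column is the response, as above:

```r
# Plot observed points and model-predicted lines for a fitted glmmTMB model.
plot_rich_predict <- function(mod, title) {
  d <- mod$frame
  names(d)[1] <- "observed"
  d$predicted <- predict(mod, type = "response")
  ggplot(d, aes(x = Trtmt_Day, y = observed, colour = as.factor(CommRich))) +
    facet_grid(protegens ~ Heat) +
    geom_jitter(alpha = 0.4) +
    geom_line(aes(y = predicted, group = as.factor(CommRich))) +
    scale_colour_viridis_d(option = "viridis", begin = 0.1, end = 0.85) +
    labs(y = "Hill diversity (rescaled)", colour = "CommRich", title = title)
}
# e.g. plot_rich_predict(rich6h_H2_1, "rich6h_H2_1")
```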
####################
# 12h heat duration
####################
# grab just the treatment with its associated control data
richness_12h <- rbind(richness_forFit %>% filter(Heat == "12", Day > 1),
richness_forFit %>% filter(Heat == "control", Day > 1, Day != 5))
# create a column for last day of heat, first day of recovery, and last day of recovery
richness_12h$Trtmt_Day <- "resist"
richness_12h$Trtmt_Day[richness_12h$Day == 3] <- "recov_1"
richness_12h$Trtmt_Day[richness_12h$Day == 4] <- "recov_2"
rich12h_H0 <- glmmTMB(HillDivq1_scale ~ CommRich*Heat*Trtmt_Day,
data = richness_12h,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = rich12h_H0, plot = TRUE)
## qu = 0.5, log(sigma) = -3.256779 : outer Newton did not converge fully.
## qu = 0.5, log(sigma) = -3.286288 : outer Newton did not converge fully.
## qu = 0.75, log(sigma) = -3.006775 : outer Newton did not converge fully.
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.428 0.42 0.412 0.368 0.328 0.396 0.368 0.352 0.416 0.388 0.36 1 0.404 0.452 0.368 0.468 0.344 0.384 0.388 0.46 ...
rich12h_H1 <- glmmTMB(HillDivq1_scale ~ CommRich*Heat*Trtmt_Day + CommRich*putida,
data = richness_12h,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = rich12h_H1, plot = TRUE)
## qu = 0.75, log(sigma) = -2.277833 : outer Newton did not converge fully.
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.4 0.388 0.376 0.392 0.352 0.428 0.4 0.332 0.4 0.384 0.356 1 0.408 0.452 0.356 0.488 0.38 0.36 0.368 0.412 ...
rich12h_H2 <- glmmTMB(HillDivq1_scale ~ CommRich*Heat*Trtmt_Day + protegens,
data = richness_12h,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = rich12h_H2, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.136 0.14 0.624 0.62 0.608 0.208 0.196 0.54 0.184 0.584 0.524 1 0.516 0.52 0.472 0.472 0.192 0.556 0.128 0.192 ...
rich12h_H2_1 <- glmmTMB(HillDivq1_scale ~ CommRich*Heat*Trtmt_Day + protegens*CommRich,
data = richness_12h,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
rich12h_H3 <- glmmTMB(HillDivq1_scale ~ CommRich*Heat*Trtmt_Day*protegens,
data = richness_12h,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = rich12h_H3, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.164 0.156 0.54 0.468 0.444 0.308 0.268 0.444 0.224 0.504 0.436 1 0.476 0.504 0.432 0.488 0.268 0.46 0.16 0.216 ...
# check preferred models
anova(rich12h_H0, rich12h_H1)
anova(rich12h_H0, rich12h_H2)
anova(rich12h_H2, rich12h_H2_1)
AIC(rich12h_H0, rich12h_H1, rich12h_H2, rich12h_H2_1, rich12h_H3) %>% arrange(AIC)
BIC(rich12h_H0, rich12h_H1, rich12h_H2, rich12h_H2_1, rich12h_H3) %>% arrange(BIC)
# H3 is the preferred model (even if its residuals are the worst)
summary(rich12h_H3)
## Family: gaussian ( identity )
## Formula: HillDivq1_scale ~ CommRich * Heat * Trtmt_Day * protegens
## Data: richness_12h
##
## AIC BIC logLik deviance df.resid
## 653.5 750.2 -301.7 603.5 328
##
##
## Dispersion estimate for gaussian family (sigma^2): 0.324
##
## Conditional model:
## Estimate Std. Error z value
## (Intercept) 4.093e+00 3.474e-01 11.783
## CommRich 1.154e-01 1.746e-01 0.660
## Heatcontrol -1.220e+00 4.914e-01 -2.482
## Trtmt_Dayrecov_2 2.090e-06 4.913e-01 0.000
## Trtmt_Dayresist 2.153e-02 4.913e-01 0.044
## protegens -1.713e-01 4.429e-01 -0.387
## CommRich:Heatcontrol 9.455e-01 2.471e-01 3.826
## CommRich:Trtmt_Dayrecov_2 1.036e-02 2.470e-01 0.042
## CommRich:Trtmt_Dayresist 5.245e-02 2.470e-01 0.212
## Heatcontrol:Trtmt_Dayrecov_2 -9.058e-02 7.127e-01 -0.127
## Heatcontrol:Trtmt_Dayresist -7.394e-01 6.950e-01 -1.064
## CommRich:protegens -1.015e-01 2.032e-01 -0.500
## Heatcontrol:protegens 1.224e+00 6.309e-01 1.940
## Trtmt_Dayrecov_2:protegens -5.214e-02 6.264e-01 -0.083
## Trtmt_Dayresist:protegens -1.012e-02 6.264e-01 -0.016
## CommRich:Heatcontrol:Trtmt_Dayrecov_2 5.925e-02 3.597e-01 0.165
## CommRich:Heatcontrol:Trtmt_Dayresist 6.388e-01 3.494e-01 1.828
## CommRich:Heatcontrol:protegens -9.470e-01 2.899e-01 -3.267
## CommRich:Trtmt_Dayrecov_2:protegens 2.936e-02 2.874e-01 0.102
## CommRich:Trtmt_Dayresist:protegens -6.343e-02 2.874e-01 -0.221
## Heatcontrol:Trtmt_Dayrecov_2:protegens 1.215e-01 9.109e-01 0.133
## Heatcontrol:Trtmt_Dayresist:protegens 7.118e-01 8.923e-01 0.798
## CommRich:Heatcontrol:Trtmt_Dayrecov_2:protegens -9.275e-02 4.210e-01 -0.220
## CommRich:Heatcontrol:Trtmt_Dayresist:protegens -6.216e-01 4.099e-01 -1.516
## Pr(>|z|)
## (Intercept) < 2e-16 ***
## CommRich 0.50895
## Heatcontrol 0.01308 *
## Trtmt_Dayrecov_2 1.00000
## Trtmt_Dayresist 0.96505
## protegens 0.69888
## CommRich:Heatcontrol 0.00013 ***
## CommRich:Trtmt_Dayrecov_2 0.96655
## CommRich:Trtmt_Dayresist 0.83183
## Heatcontrol:Trtmt_Dayrecov_2 0.89886
## Heatcontrol:Trtmt_Dayresist 0.28738
## CommRich:protegens 0.61734
## Heatcontrol:protegens 0.05237 .
## Trtmt_Dayrecov_2:protegens 0.93366
## Trtmt_Dayresist:protegens 0.98711
## CommRich:Heatcontrol:Trtmt_Dayrecov_2 0.86918
## CommRich:Heatcontrol:Trtmt_Dayresist 0.06755 .
## CommRich:Heatcontrol:protegens 0.00109 **
## CommRich:Trtmt_Dayrecov_2:protegens 0.91861
## CommRich:Trtmt_Dayresist:protegens 0.82529
## Heatcontrol:Trtmt_Dayrecov_2:protegens 0.89386
## Heatcontrol:Trtmt_Dayresist:protegens 0.42501
## CommRich:Heatcontrol:Trtmt_Dayrecov_2:protegens 0.82563
## CommRich:Heatcontrol:Trtmt_Dayresist:protegens 0.12942
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# create data.frame for plotting
rich_predict <- cbind(rich12h_H3$frame,
predict(rich12h_H3, type="response"))
colnames(rich_predict)[c(1,6)] <- c("observed", "predicted")
# plot the model predictions against the data
ggplot(rich_predict,
aes(x=Trtmt_Day, y=observed, colour=as.factor(CommRich))) +
facet_grid(protegens~Heat) +
geom_jitter(alpha=0.4) +
geom_line(aes(y=predicted, group=as.factor(CommRich))) +
scale_colour_viridis_d(option = "viridis", begin=0.1, end=0.85) +
labs(y="Hill diversity (rescaled)",
colour="CommRich",
title="rich12h_H3")
# cleanup
rm(rich_predict)
# create data.frame for plotting
rich_predict <- cbind(rich12h_H2_1$frame,
predict(rich12h_H2_1, type="response"))
colnames(rich_predict)[c(1,6)] <- c("observed", "predicted")
# plot the model predictions against the data
ggplot(rich_predict,
aes(x=Trtmt_Day, y=observed, colour=as.factor(CommRich))) +
facet_grid(protegens~Heat) +
geom_jitter(alpha=0.4) +
geom_line(aes(y=predicted, group=as.factor(CommRich))) +
scale_colour_viridis_d(option = "viridis", begin=0.1, end=0.85) +
labs(y="Hill diversity (rescaled)",
colour="CommRich",
title="rich12h_H2_1")
# cleanup
rm(rich_predict)
####################
# 24h heat duration
####################
# grab just the treatment with its associated control data
richness_24h <- rbind(richness_forFit %>% filter(Heat == "24", Day > 1),
richness_forFit %>% filter(Heat == "control", Day > 1, Day != 5))
# create a column for last day of heat, first day of recovery, and last day of recovery
richness_24h$Trtmt_Day <- "resist"
richness_24h$Trtmt_Day[richness_24h$Day == 3] <- "recov_1"
richness_24h$Trtmt_Day[richness_24h$Day == 4] <- "recov_2"
rich24h_H0 <- glmmTMB(HillDivq1_scale ~ CommRich*Heat*Trtmt_Day,
data = richness_24h,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = rich24h_H0, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.78 0.492 0.416 0.476 0.4 0.396 0.432 0.392 0.476 0.764 0.424 0.344 0.604 0.42 0.396 0.696 0.416 0.312 0.42 0.504 ...
rich24h_H1 <- glmmTMB(HillDivq1_scale ~ CommRich*Heat*Trtmt_Day + putida,
data = richness_24h,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = rich24h_H1, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.748 0.472 0.396 0.488 0.424 0.38 0.432 0.38 0.448 0.748 0.444 0.384 0.636 0.408 0.376 0.68 0.444 0.304 0.404 0.48 ...
rich24h_H2 <- glmmTMB(HillDivq1_scale ~ CommRich*Heat*Trtmt_Day + protegens,
data = richness_24h,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = rich24h_H2, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.596 0.284 0.584 0.32 0.708 0.292 0.664 0.568 0.256 0.592 0.584 0.532 0.34 0.512 0.468 0.416 0.472 0.308 0.572 0.296 ...
rich24h_H2_1 <- glmmTMB(HillDivq1_scale ~ CommRich*Heat*Trtmt_Day + protegens*CommRich,
data = richness_24h,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
rich24h_H3 <- glmmTMB(HillDivq1_scale ~ CommRich*Heat*Trtmt_Day*protegens,
data = richness_24h,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = rich24h_H3, plot = TRUE)
## qu = 0.25, log(sigma) = -3.588259 : outer Newton did not converge fully.
## qu = 0.25, log(sigma) = -3.838263 : outer Newton did not converge fully.
## qu = 0.25, log(sigma) = -3.826341 : outer Newton did not converge fully.
## Warning in newton(lsp = lsp, X = G$X, y = G$y, Eb = G$Eb, UrS = G$UrS, L = G$L,
## : Fitting terminated with step failure - check results carefully
## qu = 0.75, log(sigma) = -3.066158 : outer Newton did not converge fully.
## qu = 0.75, log(sigma) = -3.067873 : outer Newton did not converge fully.
## qu = 0.75, log(sigma) = -3.068028 : outer Newton did not converge fully.
## qu = 0.75, log(sigma) = -3.067751 : outer Newton did not converge fully.
## Warning in newton(lsp = lsp, X = G$X, y = G$y, Eb = G$Eb, UrS = G$UrS, L = G$L,
## : Fitting terminated with step failure - check results carefully
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.62 0.272 0.488 0.58 0.512 0.556 0.476 0.472 0.232 0.616 0.512 0.452 0.34 0.54 0.488 0.14 0.504 0.412 0.512 0.268 ...
# check preferred models
anova(rich24h_H0, rich24h_H1)
anova(rich24h_H0, rich24h_H2)
anova(rich24h_H2, rich24h_H2_1)
AIC(rich24h_H0, rich24h_H1, rich24h_H2, rich24h_H2_1, rich24h_H3) %>% arrange(AIC)
BIC(rich24h_H0, rich24h_H1, rich24h_H2, rich24h_H2_1, rich24h_H3) %>% arrange(BIC)
# create data.frame for plotting
rich_predict <- cbind(rich24h_H3$frame,
predict(rich24h_H3, type="response"))
colnames(rich_predict)[c(1,6)] <- c("observed", "predicted")
# plot the model predictions against the data
ggplot(rich_predict,
aes(x=Trtmt_Day, y=observed, colour=as.factor(CommRich))) +
facet_grid(protegens~Heat) +
geom_jitter(alpha=0.4) +
geom_line(aes(y=predicted, group=as.factor(CommRich))) +
scale_colour_viridis_d(option = "viridis", begin=0.1, end=0.85) +
labs(y="Hill diversity (rescaled)",
colour="CommRich",
title="rich24h_H3")
# cleanup
rm(rich_predict)
# create data.frame for plotting
rich_predict <- cbind(rich24h_H2_1$frame,
predict(rich24h_H2_1, type="response"))
colnames(rich_predict)[c(1,6)] <- c("observed", "predicted")
# plot the model predictions against the data
ggplot(rich_predict,
aes(x=Trtmt_Day, y=observed, colour=as.factor(CommRich))) +
facet_grid(protegens~Heat) +
geom_jitter(alpha=0.4) +
geom_line(aes(y=predicted, group=as.factor(CommRich))) +
scale_colour_viridis_d(option = "viridis", begin=0.1, end=0.85) +
labs(y="Hill diversity (rescaled)",
colour="CommRich",
title="rich24h_H2_1")
# cleanup
rm(rich_predict)
####################
# 48h heat duration
####################
# grab just the treatment with its associated control data
richness_48h <- rbind(richness_forFit %>% filter(Heat == "48", Day > 2),
richness_forFit %>% filter(Heat == "control", Day > 2))
# create a column for last day of heat, first day of recovery, and last day of recovery
richness_48h$Trtmt_Day <- "resist"
richness_48h$Trtmt_Day[richness_48h$Day == 4] <- "recov_1"
richness_48h$Trtmt_Day[richness_48h$Day == 5] <- "recov_2"
# Option considered: drop the resistance data from the 48h treatment because glmmTMB fails to converge, reporting a non-positive-definite (NPD) Hessian
# Running diagnose(<model>) shows the likelihood surface is flat near the MLE for the parameters Trtmt_Dayresist, CommRich:Trtmt_Dayresist, Heatcontrol:Trtmt_Dayresist, and CommRich:Heatcontrol:Trtmt_Dayresist
# This is likely caused by the many NA values on the last day of heat for this longest duration
#richness_48h <- richness_48h %>% filter(Trtmt_Day != "resist")
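A minimal sketch (toy data, hypothetical names) of why the `resist` terms are unidentifiable: when every observation in one factor level is NA, those rows are dropped from the fit and that level's coefficient cannot be estimated. This is the fixed-effect analogue of the flat likelihood / NPD Hessian glmmTMB reports for the `Trtmt_Dayresist` terms.

```r
# Toy example: all responses for the "resist" level are NA
toy <- data.frame(
  y   = c(1.2, 0.8, NA, NA, 1.1, 0.9),
  day = factor(c("recov_1", "recov_2", "resist", "resist", "recov_1", "recov_2"))
)
fit <- lm(y ~ day, data = toy)  # rows with NA responses are silently dropped
coef(fit)["dayresist"]          # NA: no data remain to identify this level
```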
rich48h_H0 <- glmmTMB(HillDivq1_scale ~ CommRich*Heat*Trtmt_Day,
data = richness_48h,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = rich48h_H0, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.508 0.464 0.54 0.464 0.484 0.508 0.476 0.52 0.484 0.528 0.488 0.436 0.484 0.456 0.496 0.46 0.532 0.496 0.516 0.54 ...
rich48h_H1 <- glmmTMB(HillDivq1_scale ~ CommRich*Heat*Trtmt_Day + putida,
data = richness_48h,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = rich48h_H1, plot = TRUE)
## qu = 0.5, log(sigma) = -2.770627 : outer Newton did not converge fully.
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.444 0.496 0.432 0.504 0.412 0.568 0.536 0.476 0.452 0.524 0.364 0.496 0.424 0.512 0.456 0.444 0.604 0.488 0.564 0.608 ...
rich48h_H2 <- glmmTMB(HillDivq1_scale ~ CommRich*Heat*Trtmt_Day + protegens,
data = richness_48h,
control = glmmTMBControl(optCtrl = list(iter.max = 1e6, eval.max = 1e6))) # note: larger iteration budget than the other models
simulateResiduals(fittedModel = rich48h_H2, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.476 0.436 0.616 0.424 0.468 0.492 0.464 0.516 0.476 0.528 0.568 0.424 0.46 0.436 0.488 0.452 0.516 0.496 0.492 0.508 ...
rich48h_H2_1 <- glmmTMB(HillDivq1_scale ~ CommRich*Heat*Trtmt_Day + protegens*CommRich,
data = richness_48h,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
rich48h_H3 <- glmmTMB(HillDivq1_scale ~ CommRich*Heat*Trtmt_Day*protegens,
data = richness_48h,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = rich48h_H3, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.508 0.464 0.54 0.464 0.484 0.508 0.476 0.52 0.484 0.528 0.488 0.436 0.484 0.456 0.496 0.46 0.532 0.496 0.516 0.54 ...
# check preferred models
anova(rich48h_H0, rich48h_H1)
anova(rich48h_H0, rich48h_H2)
anova(rich48h_H2, rich48h_H2_1)
AIC(rich48h_H0, rich48h_H1, rich48h_H2, rich48h_H2_1, rich48h_H3) %>% arrange(AIC)
BIC(rich48h_H0, rich48h_H1, rich48h_H2, rich48h_H2_1, rich48h_H3) %>% arrange(BIC)
# create data.frame for plotting
rich_predict <- cbind(rich48h_H3$frame,
predict(rich48h_H3, type="response"))
colnames(rich_predict)[c(1,6)] <- c("observed", "predicted")
# plot the model predictions against the data
ggplot(rich_predict,
aes(x=Trtmt_Day, y=observed, colour=as.factor(CommRich))) +
facet_grid(protegens~Heat) +
geom_jitter(alpha=0.4) +
geom_line(aes(y=predicted, group=as.factor(CommRich))) +
scale_colour_viridis_d(option = "viridis", begin=0.1, end=0.85) +
labs(y="Hill diversity (rescaled)",
colour="CommRich",
title="rich48h_H3")
# cleanup
rm(rich_predict)
# create data.frame for plotting
rich_predict <- cbind(rich48h_H2_1$frame,
predict(rich48h_H2_1, type="response"))
colnames(rich_predict)[c(1,6)] <- c("observed", "predicted")
# plot the model predictions against the data
ggplot(rich_predict,
aes(x=Trtmt_Day, y=observed, colour=as.factor(CommRich))) +
facet_grid(protegens~Heat) +
geom_jitter(alpha=0.4) +
geom_line(aes(y=predicted, group=as.factor(CommRich))) +
scale_colour_viridis_d(option = "viridis", begin=0.1, end=0.85) +
labs(y="Hill diversity (rescaled)",
colour="CommRich",
title="rich48h_H2_1")
# cleanup
rm(rich_predict)
## so let's use model H3 throughout: CommRich*Heat*Trtmt_Day*protegens
############################
# effect sizes
############################
## remember that we must correct for the multiple comparisons created by subsetting the data into 4 heat-duration parts
## THEREFORE CONSIDER ALPHA/N = 0.05/4 = 0.0125 as the threshold for significance
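The Bonferroni-style correction above can be sketched in a couple of lines (the p-values here are hypothetical, for illustration only):

```r
# Bonferroni-adjusted threshold for the 4 separately analysed heat-duration subsets
n_subsets <- 4
alpha_adj <- 0.05 / n_subsets  # 0.0125
# hypothetical contrast p-values from the four subsets:
p_vals <- c(0.002, 0.030, 0.010, 0.200)
p_vals < alpha_adj  # only contrasts below 0.0125 are called significant
```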
emm_6h <- emmeans(rich6h_H3, ~ Heat | CommRich*Trtmt_Day*protegens, data = richness_6h, type = "response")
effect_6h <- eff_size(emm_6h, sigma(rich6h_H3), edf = df.residual(rich6h_H3))
emm_12h <- emmeans(rich12h_H3, ~ Heat | CommRich*Trtmt_Day*protegens, data = richness_12h, type = "response")
effect_12h <- eff_size(emm_12h, sigma(rich12h_H3), edf = df.residual(rich12h_H3))
emm_24h <- emmeans(rich24h_H3, ~ Heat | CommRich*Trtmt_Day*protegens, data = richness_24h, type = "response")
effect_24h <- eff_size(emm_24h, sigma(rich24h_H3), edf = df.residual(rich24h_H3))
emm_48h <- emmeans(rich48h_H3, ~ Heat | CommRich*Trtmt_Day*protegens, data = richness_48h, type = "response")
effect_48h <- eff_size(emm_48h, sigma(rich48h_H3), edf = df.residual(rich48h_H3))
# create a data.frame for plotting marginal effect sizes using a forest plot with the group labels
div_effects_protegens <- data.frame()
div_effects_protegens <- rbind(div_effects_protegens,
get_effsize_CIs(effect_6h, heat_trtmt = 6),
get_effsize_CIs(effect_12h, heat_trtmt = 12),
get_effsize_CIs(effect_24h, heat_trtmt = 24),
get_effsize_CIs(effect_48h, heat_trtmt = 48))
# re-order the levels of Trtmt_Day to go from resistance to recovery then rename them for nice plotting
div_effects_protegens$Trtmt_Day <- factor(div_effects_protegens$Trtmt_Day,
levels = c("recov_2", "recov_1", "resist"))
levels(div_effects_protegens$Trtmt_Day) <- c("Recovery (Day 2)", "Recovery (Day 1)", "Resistance")
# plot conditional part of the model
ggplot(div_effects_protegens,
aes(x = est, y = as.factor(Heat), colour = Trtmt_Day, shape = as.logical(protegens))) +
geom_vline(xintercept = 0, colour="darkgrey") +
geom_point(position = position_dodge(width = 0.5)) +
geom_errorbarh(position = position_dodge(width = 0.5),
aes(xmin = loCI, xmax = hiCI), height = 0.1) +
scale_colour_manual(values=trtmt_pal) +
labs(x = "Effect Size on Richness",
shape = "protegens\npresent?",
y="Heat duration")
# this is why the H3 model is strongly preferred:
# there's a strong interaction between protegens & heat
# we can do a posthoc on this to illustrate statistically significant effects
posthoc_6h <- emmeans(effect_6h, pairwise ~ Trtmt_Day*protegens, data = richness_6h)
## NOTE: Results may be misleading due to involvement in interactions
posthoc_12h <- emmeans(effect_12h, pairwise ~ Trtmt_Day*protegens, data = richness_12h)
## NOTE: Results may be misleading due to involvement in interactions
posthoc_24h <- emmeans(effect_24h, pairwise ~ Trtmt_Day*protegens, data = richness_24h)
## NOTE: Results may be misleading due to involvement in interactions
posthoc_48h <- emmeans(effect_48h, pairwise ~ Trtmt_Day*protegens, data = richness_48h)
## NOTE: Results may be misleading due to involvement in interactions
# create a data.frame for plotting
div_effects <- data.frame()
div_effects <- rbind(div_effects,
get_posthoc(posthoc_6h, heat_trtmt = 6),
get_posthoc(posthoc_12h, heat_trtmt = 12),
get_posthoc(posthoc_24h, heat_trtmt = 24),
get_posthoc(posthoc_48h, heat_trtmt = 48))
# re-order the levels of Trtmt_Day to go from resistance to recovery then rename them for nice plotting
div_effects$Trtmt_Day <- factor(div_effects$Trtmt_Day,
levels = c("recov_2", "recov_1", "resist"))
levels(div_effects$Trtmt_Day) <- c("Recovery (Day 2)", "Recovery (Day 1)", "Resistance")
# plot with group labels
ggplot(div_effects,
aes(x = est, y = as.factor(Heat), colour = Trtmt_Day, shape=as.logical(protegens))) +
facet_grid(~protegens) +
geom_vline(xintercept = 0, colour="darkgrey") +
geom_point(position = position_dodge(width = 0.5)) +
geom_errorbarh(position = position_dodge(width = 0.5),
aes(xmin = loCI, xmax = hiCI), height = 0.1) +
geom_text(position = position_dodge(width = 0.5),
aes(x=-2.5, label=groups)) +
scale_colour_manual(values=trtmt_pal) +
labs(x = "Effect Size on Richness",
y="Heat duration")
# we can also average over the effect of protegens
# we can do a posthoc on this to illustrate statistically significant effects
posthoc_6h <- emmeans(effect_6h, pairwise ~ Trtmt_Day, data = richness_6h)
## NOTE: Results may be misleading due to involvement in interactions
posthoc_12h <- emmeans(effect_12h, pairwise ~ Trtmt_Day, data = richness_12h)
## NOTE: Results may be misleading due to involvement in interactions
posthoc_24h <- emmeans(effect_24h, pairwise ~ Trtmt_Day, data = richness_24h)
## NOTE: Results may be misleading due to involvement in interactions
posthoc_48h <- emmeans(effect_48h, pairwise ~ Trtmt_Day, data = richness_48h)
## NOTE: Results may be misleading due to involvement in interactions
# create a data.frame for plotting
div_effects <- data.frame()
div_effects <- rbind(div_effects,
get_posthocTEMP(posthoc_6h, heat_trtmt = 6),
get_posthocTEMP(posthoc_12h, heat_trtmt = 12),
get_posthocTEMP(posthoc_24h, heat_trtmt = 24),
get_posthocTEMP(posthoc_48h, heat_trtmt = 48))
# re-order the levels of Trtmt_Day to go from resistance to recovery then rename them for nice plotting
div_effects$Trtmt_Day <- factor(div_effects$Trtmt_Day,
levels = c("recov_2", "recov_1", "resist"))
levels(div_effects$Trtmt_Day) <- c("Recovery (Day 2)", "Recovery (Day 1)", "Resistance")
# plot with group labels
ggplot(div_effects,
aes(x = est, y = as.factor(Heat), colour = Trtmt_Day)) +
geom_vline(xintercept = 0, colour="darkgrey") +
geom_point(position = position_dodge(width = 0.5)) +
geom_errorbarh(position = position_dodge(width = 0.5),
aes(xmin = loCI, xmax = hiCI), height = 0.1) +
geom_text(position = position_dodge(width = 0.5),
aes(x=-2.5, label=groups)) +
scale_colour_manual(values=trtmt_pal) +
labs(x = "Effect Size on Richness",
y="Heat duration",
title = "averaged over protegens")
rm(div_effects, posthoc_6h, posthoc_12h, posthoc_24h, posthoc_48h,
div_effects_protegens, emm_6h, emm_12h, emm_24h, emm_48h, effect_6h, effect_12h,
effect_24h, effect_48h,
rich48h_H0, rich48h_H1, rich48h_H2, rich48h_H2_1, rich48h_H3,
rich24h_H0, rich24h_H1, rich24h_H2, rich24h_H2_1, rich24h_H3,
rich12h_H0, rich12h_H1, rich12h_H2, rich12h_H2_1, rich12h_H3,
rich6h_H0, rich6h_H1, rich6h_H2, rich6h_H2_1, rich6h_H3,
richness_6h, richness_12h, richness_24h, richness_48h)
## plot effect size of productivity (aka Total absolute density)
plot(ggplot(effectSize,
aes(x=Day,
y=TotDen_plusEpsilon_mean,
colour=CommRich,
group=community)) +
facet_grid(~Heat) +
geom_jitter(alpha=0.2, size=0.8, width=0.05) +
stat_summary(fun=mean, geom="line", alpha=0.9) +
scale_y_log10() +
scale_colour_viridis_d(option = "viridis", end=0.65) +
labs(title="All data", colour="Inoculated\nRichness",
y="Effect size on Total Density+epsilon"))
## Warning: Removed 1 row containing non-finite outside the scale range
## (`stat_summary()`).
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).
plot(ggplot(effectSize %>%
filter(protegens==1),
aes(x=Day,
y=TotDen_plusEpsilon_mean,
colour=CommRich,
group=community)) +
facet_grid(~Heat) +
geom_jitter(alpha=0.2, size=0.8, width=0.05) +
stat_summary(fun=mean, geom="line", alpha=0.9) +
scale_y_log10() +
scale_colour_viridis_d(option = "viridis", end=0.65) +
labs(title="Protegens included", colour="Inoculated\nRichness",
y="Effect size on Total Density+epsilon"))
## Warning: Removed 1 row containing non-finite outside the scale range (`stat_summary()`).
## Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).
plot(ggplot(effectSize %>%
filter(protegens==0),
aes(x=Day,
y=TotDen_plusEpsilon_mean,
colour=CommRich,
group=community)) +
facet_grid(~Heat) +
geom_jitter(alpha=0.2, size=0.8, width=0.05) +
stat_summary(fun=mean, geom="line", alpha=0.9) +
scale_y_log10() +
scale_colour_viridis_d(option = "viridis", end=0.65) +
labs(title="Protegens excluded", colour="Inoculated\nRichness",
y="Effect size on Total Density+epsilon"))
Let’s fit these data to two types of linear models: (1) log-transform the Total Density + epsilon data and fit a (Gaussian) linear model, or (2) use the Total Density data directly and fit a negative binomial linear model.
As above, we compare the fit of 3 different models.
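Strategy (1) can be sketched with toy counts (the epsilon value here is a hypothetical pseudo-count, not the one used in this analysis): adding a small epsilon before log-transforming keeps zero counts finite, after which an ordinary Gaussian lm() applies; strategy (2) would instead hand the raw counts to a negative binomial GLM (e.g. MASS::glm.nb).

```r
counts  <- c(0, 3, 17, 250, 4800)   # toy density counts
epsilon <- 0.5                      # hypothetical pseudo-count
log_counts <- log(counts + epsilon) # finite even where counts == 0
all(is.finite(log_counts))          # TRUE
```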
#####################
# log(Total Density + epsilon) transformed data
#####################
resist_prod0 <- with(effectSize %>% filter(Last_Heat_Day == TRUE) %>%
mutate(log_TotalDensity_plusE = log(TotDen_plusEpsilon_mean)),
lm(log_TotalDensity_plusE ~ CommRich + Heat + CommRich:Heat))
# Maddy's preferred model
print("RESISTANCE PRODUCTIVITY DATA. SUMMARY OF THE SIMPLEST MODEL:")
## [1] "RESISTANCE PRODUCTIVITY DATA. SUMMARY OF THE SIMPLEST MODEL:"
summary(resist_prod0)
##
## Call:
## lm(formula = log_TotalDensity_plusE ~ CommRich + Heat + CommRich:Heat)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.3849 -0.7118 -0.0127 0.6267 5.3632
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.3863 0.3674 1.051 0.29412
## CommRich2 -0.3739 0.4811 -0.777 0.43776
## CommRich3 -0.7859 0.5196 -1.512 0.13171
## CommRich4 -0.5822 0.8216 -0.709 0.47924
## Heat12 -1.5011 0.6000 -2.502 0.01300 *
## Heat24 -5.4835 0.5264 -10.416 < 2e-16 ***
## Heat48 -7.5351 0.5339 -14.114 < 2e-16 ***
## CommRich2:Heat12 1.1846 0.7460 1.588 0.11357
## CommRich3:Heat12 1.6744 0.7982 2.098 0.03694 *
## CommRich4:Heat12 1.9570 1.2001 1.631 0.10420
## CommRich2:Heat24 2.9796 0.6856 4.346 2.02e-05 ***
## CommRich3:Heat24 4.4573 0.7695 5.792 2.08e-08 ***
## CommRich4:Heat24 5.3235 1.2216 4.358 1.92e-05 ***
## CommRich2:Heat48 0.9364 0.6967 1.344 0.18015
## CommRich3:Heat48 2.2458 0.7673 2.927 0.00374 **
## CommRich4:Heat48 2.1132 1.2248 1.725 0.08570 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.643 on 250 degrees of freedom
## (1 observation deleted due to missingness)
## Multiple R-squared: 0.741, Adjusted R-squared: 0.7254
## F-statistic: 47.68 on 15 and 250 DF, p-value: < 2.2e-16
resist_prod1 <- with(effectSize %>% filter(Last_Heat_Day == TRUE) %>%
mutate(log_TotalDensity_plusE = log(TotDen_plusEpsilon_mean)),
lm(log_TotalDensity_plusE ~ CommRich + withstands_heat + Heat + CommRich:withstands_heat + Heat:withstands_heat + CommRich:Heat))
print("RESISTANCE PRODUCTIVITY DATA. SUMMARY OF THE Putida MODEL:")
## [1] "RESISTANCE PRODUCTIVITY DATA. SUMMARY OF THE Putida MODEL:"
summary(resist_prod1)
##
## Call:
## lm(formula = log_TotalDensity_plusE ~ CommRich + withstands_heat +
## Heat + CommRich:withstands_heat + Heat:withstands_heat +
## CommRich:Heat)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.4458 -0.7404 0.0092 0.6516 5.0282
##
## Coefficients: (1 not defined because of singularities)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.08386 0.36971 0.227 0.820757
## CommRich2 0.10254 0.50079 0.205 0.837937
## CommRich3 -0.44615 0.61527 -0.725 0.469064
## CommRich4 -1.48955 0.87189 -1.708 0.088831 .
## withstands_heatTRUE 1.20982 0.55418 2.183 0.029983 *
## Heat12 -1.27369 0.57272 -2.224 0.027068 *
## Heat24 -4.84595 0.51316 -9.443 < 2e-16 ***
## Heat48 -7.33170 0.52166 -14.054 < 2e-16 ***
## CommRich2:withstands_heatTRUE -1.53462 0.53959 -2.844 0.004832 **
## CommRich3:withstands_heatTRUE -1.25951 0.61921 -2.034 0.043029 *
## CommRich4:withstands_heatTRUE NA NA NA NA
## withstands_heatTRUE:Heat12 -0.30924 0.61485 -0.503 0.615447
## withstands_heatTRUE:Heat24 -2.48307 0.57958 -4.284 2.64e-05 ***
## withstands_heatTRUE:Heat48 -0.85325 0.58518 -1.458 0.146097
## CommRich2:Heat12 1.11193 0.74020 1.502 0.134336
## CommRich3:Heat12 1.67421 0.83622 2.002 0.046378 *
## CommRich4:Heat12 2.03887 1.24114 1.643 0.101725
## CommRich2:Heat24 3.67227 0.65969 5.567 6.83e-08 ***
## CommRich3:Heat24 5.63988 0.77085 7.316 3.66e-12 ***
## CommRich4:Heat24 7.16908 1.21828 5.885 1.31e-08 ***
## CommRich2:Heat48 1.19328 0.66932 1.783 0.075858 .
## CommRich3:Heat48 2.68230 0.77028 3.482 0.000589 ***
## CommRich4:Heat48 2.76309 1.22104 2.263 0.024522 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.533 on 244 degrees of freedom
## (1 observation deleted due to missingness)
## Multiple R-squared: 0.78, Adjusted R-squared: 0.7611
## F-statistic: 41.2 on 21 and 244 DF, p-value: < 2.2e-16
resist_prod2 <- with(effectSize %>% filter(Last_Heat_Day == TRUE) %>%
mutate(log_TotalDensity_plusE = log(TotDen_plusEpsilon_mean)),
lm(log_TotalDensity_plusE ~ CommRich + protegens + Heat + CommRich:protegens + Heat:protegens + CommRich:Heat))
# compare the nested models
anova(resist_prod0, resist_prod1)
anova(resist_prod0, resist_prod2)
# compare all 3 models
AIC(resist_prod0, resist_prod1, resist_prod2) %>% arrange(AIC)
BIC(resist_prod0, resist_prod1, resist_prod2) %>% arrange(BIC)
# the statistically preferred model
print("")
## [1] ""
print("RESISTANCE PRODUCTIVITY DATA. SUMMARY OF THE LOWEST AIC & BIC MODEL:")
## [1] "RESISTANCE PRODUCTIVITY DATA. SUMMARY OF THE LOWEST AIC & BIC MODEL:"
print("")
## [1] ""
summary(resist_prod2)
##
## Call:
## lm(formula = log_TotalDensity_plusE ~ CommRich + protegens +
## Heat + CommRich:protegens + Heat:protegens + CommRich:Heat)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.6542 -0.6111 -0.0368 0.4416 6.3451
##
## Coefficients: (1 not defined because of singularities)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.3362 0.2342 1.436 0.15231
## CommRich2 -0.2076 0.3126 -0.664 0.50725
## CommRich3 -0.2355 0.3916 -0.601 0.54813
## CommRich4 -0.7324 0.5507 -1.330 0.18481
## protegensTRUE 0.2003 0.3413 0.587 0.55793
## Heat12 -1.9687 0.3816 -5.160 5.12e-07 ***
## Heat24 -6.8899 0.3268 -21.085 < 2e-16 ***
## Heat48 -8.2992 0.3319 -25.008 < 2e-16 ***
## CommRich2:protegensTRUE -0.4327 0.3214 -1.346 0.17950
## CommRich3:protegensTRUE -0.8674 0.3821 -2.270 0.02409 *
## CommRich4:protegensTRUE NA NA NA NA
## protegensTRUE:Heat12 1.0422 0.3747 2.782 0.00583 **
## protegensTRUE:Heat24 5.3346 0.3726 14.316 < 2e-16 ***
## protegensTRUE:Heat48 2.7307 0.3721 7.339 3.18e-12 ***
## CommRich2:Heat12 1.0861 0.4508 2.409 0.01672 *
## CommRich3:Heat12 1.3456 0.5027 2.676 0.00794 **
## CommRich4:Heat12 1.3825 0.7554 1.830 0.06845 .
## CommRich2:Heat24 1.7188 0.4169 4.123 5.13e-05 ***
## CommRich3:Heat24 1.6295 0.4966 3.281 0.00118 **
## CommRich4:Heat24 1.3954 0.7761 1.798 0.07341 .
## CommRich2:Heat48 0.2390 0.4244 0.563 0.57383
## CommRich3:Heat48 0.9618 0.4900 1.963 0.05080 .
## CommRich4:Heat48 0.1466 0.7768 0.189 0.85044
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.9752 on 244 degrees of freedom
## (1 observation deleted due to missingness)
## Multiple R-squared: 0.911, Adjusted R-squared: 0.9033
## F-statistic: 118.9 on 21 and 244 DF, p-value: < 2.2e-16
plot(resist_prod2)
# clean up
rm(resist_prod0, resist_prod1, resist_prod2)
# plot Maddy's preferred model
plot(ggplot(effectSize %>%
filter(Last_Heat_Day == TRUE),
aes(x=Heat,
y=TotDen_plusEpsilon_mean,
colour=CommRich,
group=CommRich)) +
geom_hline(yintercept = 1, colour="grey") +
geom_jitter(alpha=0.2, size=0.8, width=0.05) +
geom_line(stat="smooth", method=lm, alpha=0.9) +
scale_y_log10() +
scale_colour_viridis_d(option = "viridis", end=0.85) +
labs(title="Resistance", colour="Inoculated\nRichness",
y="Effect size on Total Density+epsilon", x="Heat Duration (hrs)"))
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 1 row containing non-finite outside the scale range
## (`stat_smooth()`).
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).
# plot the preferred model
plot(ggplot(effectSize %>%
filter(Last_Heat_Day == TRUE),
aes(x=Heat,
y=TotDen_plusEpsilon_mean,
colour=CommRich,
group=CommRich)) +
facet_grid(~protegens) +
geom_hline(yintercept = 1, colour="grey") +
geom_jitter(alpha=0.2, size=0.8, width=0.05) +
geom_line(stat="smooth", method=lm, alpha=0.9) +
scale_y_log10() +
scale_colour_viridis_d(option = "viridis", end=0.85) +
labs(title="Resistance (protegens present?)", colour="Inoculated\nRichness",
y="Effect size on Total Density+epsilon", x="Heat Duration (hrs)"))
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 1 row containing non-finite outside the scale range (`stat_smooth()`).
## Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).
During resistance, the presence of protegens has a strong positive effect on productivity. This makes sense because protegens can grow under heat. I expected a similar effect of putida, but we instead see a negative effect, which is puzzling. Meanwhile, the strongest effect overall is the negative effect of heat duration, and the presence of protegens in the culture reverses it.
#####################
# log(Total Density + epsilon) transformed data
#####################
recov_prod0 <- with(effectSize %>% filter(Recov_Day == 2) %>%
mutate(log_TotalDensity_plusE = log(TotDen_plusEpsilon_mean)),
lm(log_TotalDensity_plusE ~ CommRich + Heat + CommRich:Heat))
# Maddy's preferred model
print("RECOVERY PRODUCTIVITY DATA. SUMMARY OF THE SIMPLEST MODEL:")
## [1] "RECOVERY PRODUCTIVITY DATA. SUMMARY OF THE SIMPLEST MODEL:"
summary(recov_prod0)
##
## Call:
## lm(formula = log_TotalDensity_plusE ~ CommRich + Heat + CommRich:Heat)
##
## Residuals:
## Min 1Q Median 3Q Max
## -8.1325 -0.3775 0.1387 0.7810 6.2363
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.581462 0.443039 1.312 0.19057
## CommRich2 -0.184524 0.575892 -0.320 0.74892
## CommRich3 -0.040868 0.626551 -0.065 0.94805
## CommRich4 -0.009616 0.990665 -0.010 0.99226
## Heat12 0.122891 0.723479 0.170 0.86526
## Heat24 -0.728282 0.634742 -1.147 0.25232
## Heat48 -3.925273 0.643720 -6.098 4.04e-09 ***
## CommRich2:Heat12 -0.817257 0.896764 -0.911 0.36299
## CommRich3:Heat12 -0.550674 0.962455 -0.572 0.56773
## CommRich4:Heat12 -0.837585 1.446958 -0.579 0.56320
## CommRich2:Heat24 0.727617 0.823691 0.883 0.37789
## CommRich3:Heat24 0.582137 0.927842 0.627 0.53096
## CommRich4:Heat24 -0.061475 1.472904 -0.042 0.96674
## CommRich2:Heat48 2.613339 0.837097 3.122 0.00201 **
## CommRich3:Heat48 2.226866 0.925210 2.407 0.01681 *
## CommRich4:Heat48 3.732222 1.476796 2.527 0.01211 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.981 on 251 degrees of freedom
## Multiple R-squared: 0.2156, Adjusted R-squared: 0.1688
## F-statistic: 4.6 on 15 and 251 DF, p-value: 9.319e-08
recov_prod1 <- with(effectSize %>% filter(Recov_Day == 2) %>%
mutate(log_TotalDensity_plusE = log(TotDen_plusEpsilon_mean)),
lm(log_TotalDensity_plusE ~ CommRich + withstands_heat + Heat + CommRich:withstands_heat + Heat:withstands_heat + CommRich:Heat))
recov_prod2 <- with(effectSize %>% filter(Recov_Day == 2) %>%
mutate(log_TotalDensity_plusE = log(TotDen_plusEpsilon_mean)),
lm(log_TotalDensity_plusE ~ CommRich + protegens + Heat + CommRich:protegens + Heat:protegens + CommRich:Heat))
# compare the nested models
anova(recov_prod0, recov_prod1)
anova(recov_prod0, recov_prod2)
# compare all 3 models
AIC(recov_prod0, recov_prod1, recov_prod2) %>% arrange(AIC)
BIC(recov_prod0, recov_prod1, recov_prod2) %>% arrange(BIC)
# the statistically preferred model
print("")
## [1] ""
print("RECOVERY PRODUCTIVITY DATA. SUMMARY OF THE LOWEST AIC & BIC MODEL:")
## [1] "RECOVERY PRODUCTIVITY DATA. SUMMARY OF THE LOWEST AIC & BIC MODEL:"
print("")
## [1] ""
summary(recov_prod2)
##
## Call:
## lm(formula = log_TotalDensity_plusE ~ CommRich + protegens +
## Heat + CommRich:protegens + Heat:protegens + CommRich:Heat)
##
## Residuals:
## Min 1Q Median 3Q Max
## -8.1370 -0.3885 0.0210 0.4423 7.3777
##
## Coefficients: (1 not defined because of singularities)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.49833 0.42967 1.160 0.2473
## CommRich2 0.04806 0.57282 0.084 0.9332
## CommRich3 -0.27699 0.71841 -0.386 0.7002
## CommRich4 -0.25901 1.01029 -0.256 0.7979
## protegensTRUE 0.33253 0.62486 0.532 0.5951
## Heat12 -0.07454 0.70010 -0.106 0.9153
## Heat24 -0.64074 0.59959 -1.069 0.2863
## Heat48 -4.98353 0.60893 -8.184 1.50e-14 ***
## CommRich2:protegensTRUE -0.62147 0.58909 -1.055 0.2925
## CommRich3:protegensTRUE 0.09315 0.70137 0.133 0.8945
## CommRich4:protegensTRUE NA NA NA NA
## protegensTRUE:Heat12 0.34083 0.68493 0.498 0.6192
## protegensTRUE:Heat24 -0.34927 0.68116 -0.513 0.6086
## protegensTRUE:Heat48 3.77648 0.68026 5.552 7.35e-08 ***
## CommRich2:Heat12 -0.79810 0.82589 -0.966 0.3348
## CommRich3:Heat12 -0.63912 0.92234 -0.693 0.4890
## CommRich4:Heat12 -0.98098 1.38585 -0.708 0.4797
## CommRich2:Heat24 0.80973 0.76371 1.060 0.2901
## CommRich3:Heat24 0.75273 0.91091 0.826 0.4094
## CommRich4:Heat24 0.20026 1.42376 0.141 0.8883
## CommRich2:Heat48 1.64424 0.77747 2.115 0.0355 *
## CommRich3:Heat48 0.45277 0.89891 0.504 0.6149
## CommRich4:Heat48 1.01400 1.42505 0.712 0.4774
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.79 on 245 degrees of freedom
## Multiple R-squared: 0.3751, Adjusted R-squared: 0.3215
## F-statistic: 7.002 on 21 and 245 DF, p-value: 9.044e-16
plot(recov_prod2)
# clean up
rm(recov_prod0, recov_prod1, recov_prod2)
# plot Maddy's preferred model
plot(ggplot(effectSize %>%
filter(Recov_Day == 2),
aes(x=Heat,
y=Total_density_mean,
colour=CommRich,
group=CommRich)) +
geom_hline(yintercept = 1, colour="grey") +
geom_jitter(alpha=0.2, size=0.8, width=0.05) +
geom_line(stat="smooth", method=lm, alpha=0.9) +
scale_y_log10() +
scale_colour_viridis_d(option = "viridis", end=0.85) +
labs(title="Recovery", colour="Inoculated\nRichness",
y="Effect size on Total density", x="Heat Duration (hrs)"))
## Warning in scale_y_log10(): log-10 transformation introduced infinite values.
## log-10 transformation introduced infinite values.
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 19 rows containing non-finite outside the scale range
## (`stat_smooth()`).
# plot a preferred model
plot(ggplot(effectSize %>%
filter(Recov_Day == 2),
aes(x=Heat,
y=Total_density_mean,
colour=CommRich,
group=CommRich)) +
facet_grid(~protegens) +
geom_hline(yintercept = 1, colour="grey") +
geom_jitter(alpha=0.2, size=0.8, width=0.05) +
geom_line(stat="smooth", method=lm, alpha=0.9) +
scale_y_log10() +
scale_colour_viridis_d(option = "viridis", end=0.85) +
labs(title="Recovery (protegens present?)", colour="Inoculated\nRichness",
y="Effect size on Total density", x="Heat Duration (hrs)"))
## Warning in scale_y_log10(): log-10 transformation introduced infinite values.
## log-10 transformation introduced infinite values.
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 19 rows containing non-finite outside the scale range
## (`stat_smooth()`).
# plot a preferred model
plot(ggplot(effectSize %>%
filter(Recov_Day == 2),
aes(x=Heat,
y=Total_density_mean,
colour=CommRich,
group=CommRich)) +
facet_grid(~withstands_heat) +
geom_hline(yintercept = 1, colour="grey") +
geom_jitter(alpha=0.2, size=0.8, width=0.05) +
geom_line(stat="smooth", method=lm, alpha=0.9) +
scale_y_log10() +
scale_colour_viridis_d(option = "viridis", end=0.85) +
labs(title="Recovery (withstands heat? AKA putida present?)", colour="Inoculated\nRichness",
y="Effect size on Total density", x="Heat Duration (hrs)"))
## Warning in scale_y_log10(): log-10 transformation introduced infinite values.
## log-10 transformation introduced infinite values.
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 19 rows containing non-finite outside the scale range
## (`stat_smooth()`).
#####################
# log(Total Density + epsilon) transformed data
#####################
overrecov0 <- with(effectSize %>% filter(Recov_Day == 1) %>%
mutate(log_TotalDensity_plusE = log(TotDen_plusEpsilon_mean)),
lm(log_TotalDensity_plusE ~ CommRich + Heat + CommRich:Heat))
# Maddy's preferred model
print("DAY1 RECOVERY PRODUCTIVITY DATA. SUMMARY OF THE SIMPLEST MODEL:")
## [1] "DAY1 RECOVERY PRODUCTIVITY DATA. SUMMARY OF THE SIMPLEST MODEL:"
summary(overrecov0)
##
## Call:
## lm(formula = log_TotalDensity_plusE ~ CommRich + Heat + CommRich:Heat)
##
## Residuals:
## Min 1Q Median 3Q Max
## -7.6476 -0.4197 0.1399 0.8361 5.8029
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.32508 0.49827 0.652 0.51473
## CommRich2 -0.04399 0.64769 -0.068 0.94590
## CommRich3 -0.11658 0.70466 -0.165 0.86873
## CommRich4 0.37514 1.11417 0.337 0.73662
## Heat12 0.32437 0.81367 0.399 0.69049
## Heat24 -2.13849 0.71387 -2.996 0.00301 **
## Heat48 -4.04683 0.72397 -5.590 5.92e-08 ***
## CommRich2:Heat12 -0.21204 1.00856 -0.210 0.83365
## CommRich3:Heat12 0.13960 1.08244 0.129 0.89749
## CommRich4:Heat12 -0.13750 1.62734 -0.084 0.93273
## CommRich2:Heat24 2.64174 0.92638 2.852 0.00471 **
## CommRich3:Heat24 2.91732 1.04351 2.796 0.00558 **
## CommRich4:Heat24 2.58830 1.65652 1.562 0.11943
## CommRich2:Heat48 2.32394 0.94145 2.468 0.01424 *
## CommRich3:Heat48 2.30285 1.04055 2.213 0.02779 *
## CommRich4:Heat48 3.51404 1.66090 2.116 0.03535 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.228 on 251 degrees of freedom
## Multiple R-squared: 0.2675, Adjusted R-squared: 0.2238
## F-statistic: 6.112 on 15 and 251 DF, p-value: 6.684e-11
overrecov1 <- with(effectSize %>% filter(Recov_Day == 1) %>%
mutate(log_TotalDensity_plusE = log(TotDen_plusEpsilon_mean)),
lm(log_TotalDensity_plusE ~ CommRich + withstands_heat + Heat + CommRich:withstands_heat + Heat:withstands_heat + CommRich:Heat))
overrecov2 <- with(effectSize %>% filter(Recov_Day == 1) %>%
mutate(log_TotalDensity_plusE = log(TotDen_plusEpsilon_mean)),
lm(log_TotalDensity_plusE ~ CommRich + protegens + Heat + CommRich:protegens + Heat:protegens + CommRich:Heat))
# compare nested models
anova(overrecov0, overrecov1)
anova(overrecov0, overrecov2)
# which is the best one?
AIC(overrecov0, overrecov1, overrecov2) %>% arrange(AIC)
BIC(overrecov0, overrecov1, overrecov2) %>% arrange(BIC)
# the statistically preferred model
print("")
## [1] ""
print("DAY1 RECOVERY PRODUCTIVITY DATA. SUMMARY OF THE LOWEST AIC & BIC MODEL:")
## [1] "DAY1 RECOVERY PRODUCTIVITY DATA. SUMMARY OF THE LOWEST AIC & BIC MODEL:"
print("")
## [1] ""
summary(overrecov2)
##
## Call:
## lm(formula = log_TotalDensity_plusE ~ CommRich + protegens +
## Heat + CommRich:protegens + Heat:protegens + CommRich:Heat)
##
## Residuals:
## Min 1Q Median 3Q Max
## -6.5159 -0.5866 -0.0241 0.5840 7.4312
##
## Coefficients: (1 not defined because of singularities)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.04371 0.45215 0.097 0.923071
## CommRich2 0.30384 0.60280 0.504 0.614673
## CommRich3 -0.23961 0.75601 -0.317 0.751561
## CommRich4 -0.46898 1.06316 -0.441 0.659519
## protegensTRUE 1.12549 0.65755 1.712 0.088231 .
## Heat12 0.12323 0.73674 0.167 0.867304
## Heat24 -2.36496 0.63096 -3.748 0.000222 ***
## Heat48 -5.39372 0.64080 -8.417 3.25e-15 ***
## CommRich2:protegensTRUE -1.25398 0.61991 -2.023 0.044176 *
## CommRich3:protegensTRUE -0.58629 0.73807 -0.794 0.427758
## CommRich4:protegensTRUE NA NA NA NA
## protegensTRUE:Heat12 0.03255 0.72078 0.045 0.964017
## protegensTRUE:Heat24 0.80434 0.71680 1.122 0.262906
## protegensTRUE:Heat48 4.73628 0.71586 6.616 2.30e-10 ***
## CommRich2:Heat12 -0.02406 0.86911 -0.028 0.977937
## CommRich3:Heat12 0.29376 0.97061 0.303 0.762408
## CommRich4:Heat12 0.03110 1.45837 0.021 0.983005
## CommRich2:Heat24 2.46383 0.80368 3.066 0.002415 **
## CommRich3:Heat24 2.47336 0.95857 2.580 0.010456 *
## CommRich4:Heat24 2.01043 1.49827 1.342 0.180891
## CommRich2:Heat48 1.12326 0.81815 1.373 0.171033
## CommRich3:Heat48 0.09753 0.94595 0.103 0.917963
## CommRich4:Heat48 0.12466 1.49962 0.083 0.933817
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.884 on 245 degrees of freedom
## Multiple R-squared: 0.4891, Adjusted R-squared: 0.4453
## F-statistic: 11.17 on 21 and 245 DF, p-value: < 2.2e-16
plot(overrecov2)
# clean up
rm(overrecov0, overrecov1, overrecov2)
# plot Maddy's preferred model
plot(ggplot(effectSize %>% filter(Recov_Day == 1),
aes(x=Heat, y=Total_density_mean,
colour=CommRich, group=CommRich)) +
scale_y_log10() +
geom_hline(yintercept = 1, colour="grey") +
geom_jitter(alpha=0.2, size=0.8, width=0.1) +
geom_line(stat="smooth", method=lm, alpha=0.9) +
scale_colour_viridis_d(option = "viridis", end=0.85) +
labs(title="Day 1 of Recovery", colour="Inoculated\nRichness",
y="Effect size on Total density"))
## Warning in scale_y_log10(): log-10 transformation introduced infinite values.
## log-10 transformation introduced infinite values.
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 19 rows containing non-finite outside the scale range
## (`stat_smooth()`).
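The infinite-value warnings arise because replicates with an effect size of 0 map to -Inf under `scale_y_log10()` and get dropped. A pseudo-log axis keeps them on the plot; a minimal sketch with a toy data frame (a stand-in for illustration, not the real `effectSize`):

```r
library(ggplot2)
library(scales)

# toy stand-in with zeros that a plain log10 axis would drop
toy <- data.frame(Heat = factor(rep(c("0", "24", "48"), each = 3)),
                  Total_density_mean = c(0, 1e3, 1e6,
                                         0, 1e2, 1e4,
                                         0, 0, 1e1))
ggplot(toy, aes(x = Heat, y = Total_density_mean)) +
  geom_jitter(width = 0.1) +
  # pseudo_log is linear near 0, so zero effect sizes stay finite
  scale_y_continuous(trans = pseudo_log_trans(sigma = 1, base = 10))
```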
# plot the preferred model
plot(ggplot(effectSize %>% filter(Recov_Day == 1),
aes(x=Heat, y=Total_density_mean,
colour=CommRich, group=CommRich)) +
facet_grid(~protegens) +
scale_y_log10() +
geom_hline(yintercept = 1, colour="grey") +
geom_jitter(alpha=0.2, size=0.8, width=0.1) +
geom_line(stat="smooth", method=lm, alpha=0.9) +
scale_colour_viridis_d(option = "viridis", end=0.85) +
labs(title="Day 1 of Recovery (protegens present?)", colour="Inoculated\nRichness",
y="Effect size on Total density"))
## Warning in scale_y_log10(): log-10 transformation introduced infinite values.
## log-10 transformation introduced infinite values.
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 19 rows containing non-finite outside the scale range
## (`stat_smooth()`).
# clean up
rm(effectSize)
One trend I noticed while combing through the data to fit the gLV model is that biomass appears systematically higher on Day 1 of recovery. …???
Now we repeat the same emmeans-style analysis that we used for diversity, but with total density (a proxy for productivity) as the response. Here I am a priori more comfortable with a Poisson or negative binomial family because total density is closer to count data.
Remember that for wells that DID recover during the recovery phase (i.e., those that did not go extinct), total densities below the threshold of detection were set to an epsilon corresponding to that threshold. (The remaining NA values represent missing data due to pipetting mistakes or clogs during flow cytometry.) These below-detection epsilon values make up the majority of observations during resistance for the longest heat duration. See further discussion in the section below.
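The epsilon substitution described above can be sketched as follows (the column names, recovery flag, and threshold handling here are assumptions for illustration, not the actual upstream code; the 0.086 value is the approximate detection threshold mentioned later in the text):

```r
# toy data: a below-detection well that later recovered (NA -> epsilon),
# a true extinction (stays 0), and an ordinary measurement
epsilon <- 0.086  # approximate detection threshold (assumed value)
toy <- data.frame(Total_density = c(NA, 0, 5.2),
                  recovered     = c(TRUE, FALSE, TRUE))
toy$Total_density <- ifelse(is.na(toy$Total_density) & toy$recovered,
                            epsilon, toy$Total_density)
toy$Total_density  # 0.086 0.000 5.200
```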
# scale the data by its standard deviation
absDen_forFit$TotDensity_scale <- scale(absDen_forFit$Total_density,
scale = sd(absDen_forFit$Total_density, na.rm = TRUE),
center = FALSE)
# the max scaled value is ~7.9 and almost 3% of the values are 0
summary(absDen_forFit$TotDensity_scale)
## V1
## Min. :0.0000
## 1st Qu.:0.1041
## Median :0.2425
## Mean :0.6746
## 3rd Qu.:0.8045
## Max. :7.8953
## NA's :9
# exclude the missing values, otherwise the proportion comes back NA
sum(absDen_forFit$TotDensity_scale == 0, na.rm = TRUE) /
sum(!is.na(absDen_forFit$TotDensity_scale))
# in fact, the total density data is even more long-tailed than the diversity data. I guess that makes sense as there is a max value for the possible diversity with 4 species.
hist(absDen_forFit$TotDensity_scale)
# re-arrange the levels so that emmeans can be run:
absDen_forFit$Heat <- as.character(absDen_forFit$Heat)
absDen_forFit$Heat[which(absDen_forFit$Heat == 0)] <- "control"
# !!! emmeans expects the control to be the very *last* level !!!
absDen_forFit$Heat <- factor(absDen_forFit$Heat,
levels = c("6", "12", "24", "48", "control"))
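The reason the control must be the very last level is that emmeans's treatment-vs-control contrast family `trt.vs.ctrlk` takes the *last* factor level as the reference. A hedged sketch of that contrast call with a toy model (illustrative only, not the real fit):

```r
library(emmeans)

set.seed(1)
# control placed last so trt.vs.ctrlk compares each duration against it
toy <- data.frame(Heat = factor(rep(c("6", "12", "24", "control"), each = 5),
                                levels = c("6", "12", "24", "control")),
                  y = rnorm(20))
fit <- lm(y ~ Heat, data = toy)
# contrasts come out as "6 - control", "12 - control", "24 - control"
emmeans(fit, trt.vs.ctrlk ~ Heat)$contrasts
```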
# let's keep CommRich and Day as numeric for now while we look for the best fitting GLM family
# let's compare different GLM families
try_gaussian <- glmmTMB(TotDensity_scale ~ CommRich*Heat*Day*protegens,
data = absDen_forFit,
control = glmmTMBControl(optCtrl = list(iter.max = 10000,eval.max = 10000)))
simulateResiduals(fittedModel = try_gaussian, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.936 0.828 0.464 0.604 0.104 0.448 0.092 0.328 0.448 0.5 0.616 0.668 0.692 0.144 0.468 0.484 0.828 0.54 0.136 1 ...
try_gamma <- glmmTMB(TotDensity_scale ~ CommRich*Heat*Day*protegens,
data = absDen_forFit,
family = ziGamma,
ziformula = ~1, # this needs to be added because there are 0 values in the data
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
## Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
## NA/NaN function evaluation
simulateResiduals(fittedModel = try_gamma, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.812 0.964 0.5 0.748 0.336 0.532 0.336 0.612 0.46 0.596 0.64 0.78 0.816 0.368 0.508 0.636 0.748 0.664 0.376 0.952 ...
try_lognorm <- glmmTMB(TotDensity_scale ~ CommRich*Heat*Day*protegens,
data = absDen_forFit,
family = lognormal,
ziformula = ~1, # this needs to be added because there are 0 values in the data
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = try_lognorm, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.836 0.876 0.356 0.612 0.26 0.384 0.176 0.484 0.32 0.436 0.668 0.596 0.708 0.164 0.372 0.472 0.796 0.468 0.272 0.964 ...
try_LOGlognorm <- glmmTMB(log(TotDensity_scale + 1) ~ CommRich*Heat*Day*protegens,
data = absDen_forFit,
family = lognormal,
ziformula = ~1, # keep the zero-inflation term: the zero-inflated lognormal alone was already over-dispersed, so test whether the log(x+1) transformation sufficiently pulls in the long tail
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = try_LOGlognorm, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.776 0.912 0.356 0.644 0.26 0.396 0.196 0.48 0.32 0.448 0.624 0.644 0.74 0.192 0.384 0.48 0.776 0.492 0.252 0.928 ...
try_negbinom <- glmmTMB(as.integer(TotDensity_scale * 1000) ~ CommRich*Heat*Day*protegens,
data = absDen_forFit,
family = nbinom2,
control = glmmTMBControl(optCtrl = list(iter.max = 10000,eval.max = 10000)))
simulateResiduals(fittedModel = try_negbinom, plot = TRUE)
## DHARMa:testOutliers with type = binomial may have inflated Type I error rates for integer-valued distributions. To get a more exact result, it is recommended to re-run testOutliers with type = 'bootstrap'. See ?testOutliers for details
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.776 0.944 0.536 0.784 0.404 0.612 0.456 0.548 0.5558496 0.6970012 0.6674832 0.84 0.848 0.444 0.54 0.6242595 0.748 0.6612216 0.408 0.892 ...
try_negbinom0 <- glmmTMB(as.integer(Total_density * 1000) ~ CommRich*Heat*Day*protegens,
data = absDen_forFit,
family = nbinom2,
ziformula = ~1, # try zero inflated distribution
control = glmmTMBControl(optCtrl = list(iter.max = 10000,eval.max = 10000)))
simulateResiduals(fittedModel = try_negbinom0, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.752 0.952 0.508 0.76 0.344 0.568 0.364 0.516 0.504 0.676 0.644 0.876 0.872 0.384 0.564 0.64 0.732 0.736 0.404 0.956 ...
try_poisson <- glmmTMB(as.integer(TotDensity_scale * 1000) ~ CommRich*Heat*Day*protegens,
data = absDen_forFit,
family = genpois,
control = glmmTMBControl(optCtrl = list(iter.max = 10000,eval.max = 10000)))
simulateResiduals(fittedModel = try_poisson, plot = TRUE)
## DHARMa:testOutliers with type = binomial may have inflated Type I error rates for integer-valued distributions. To get a more exact result, it is recommended to re-run testOutliers with type = 'bootstrap'. See ?testOutliers for details
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.788 0.804 0.3376638 0.612 0.244 0.416 0.144 0.392 0.324256 0.496 0.592 0.668 0.6574457 0.188 0.384 0.436 0.756 0.508 0.2 0.924 ...
try_poisson0 <- glmmTMB(as.integer(TotDensity_scale * 1000) ~ CommRich*Heat*Day*protegens,
data = absDen_forFit,
family = genpois,
ziformula = ~1, # try zero inflated distribution
control = glmmTMBControl(optCtrl = list(iter.max = 10000,eval.max = 10000)))
simulateResiduals(fittedModel = try_poisson0, plot = TRUE)
## DHARMa:testOutliers with type = binomial may have inflated Type I error rates for integer-valued distributions. To get a more exact result, it is recommended to re-run testOutliers with type = 'bootstrap'. See ?testOutliers for details
## qu = 0.25, log(sigma) = -2.001813 : outer Newton did not converge fully.
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.784 0.824 0.384 0.636 0.188 0.3772009 0.232 0.464 0.364 0.4753274 0.612 0.68 0.7 0.208 0.356 0.436 0.748 0.5187042 0.208 0.952 ...
# let's check this with AIC and BIC
AIC(try_gaussian, try_gamma, try_lognorm, try_LOGlognorm,
try_negbinom, try_negbinom0, try_poisson, try_poisson0) %>% arrange(AIC)
BIC(try_gaussian, try_gamma, try_lognorm, try_LOGlognorm,
try_negbinom, try_negbinom0, try_poisson, try_poisson0) %>% arrange(BIC)
# clean up
rm(try_gaussian, try_gamma, try_lognorm, try_LOGlognorm, try_negbinom, try_negbinom0, try_poisson, try_poisson0)
Okay, let's go with the Poisson family. Its residuals look slightly worse than those of the log(x+1)-transformed lognormal, but that latter model makes me uneasy, whereas a Poisson-type family is what I would expect for count-style data like total density.
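For the record, the count-family fits feed `as.integer(TotDensity_scale * 1000)` to the model: the rescaled densities are blown up by 1000 and truncated into pseudo-counts, and predictions must be divided back by 1000 to return to the scaled-density axis. The values below are illustrative, not the real column:

```r
# round-trip used for the genpois/nbinom2 fits
x <- c(0, 0.1041, 0.2425, 7.8953)  # scaled densities (cf. summary above)
counts <- as.integer(x * 1000)     # pseudo-counts fed to the count family
counts                             # 0 104 242 7895
counts / 1000                      # back on the scaled-density axis
```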
First let’s analyze productivity with the whole dataset, including the replicates for 24h and 48h durations that were below the threshold of detection during resistance (NA values that have been replaced with \(\approx 0.086\)) and true 0’s that never recovered.
The resistance effect estimated for 24h duration should be treated with some skepticism as 12.1% of the data (8/66) is NA values, with all reps missing from P. grimontii monocultures, 2 reps missing from P. veronii monocultures, and 1 rep missing from P. putida monocultures.
And the resistance effect at 48h should be treated with extreme caution because 78.1% of the data (50/64 replicates) is NA values. There are no monocultures from either slow-growing species, 1 monoculture from P. protegens, and 1 monoculture from P. putida.
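The NA percentages quoted above can be tabulated per heat duration; a minimal sketch with a toy frame whose counts are chosen to reproduce the quoted figures (column names mirror the real data, but this is not the real frame):

```r
library(dplyr)

# toy frame: 8/66 NAs at 24h and 50/64 NAs at 48h, as quoted above
toy <- data.frame(Heat = rep(c("24", "48"), times = c(66, 64)),
                  Total_density = c(rep(NA, 8), rep(1, 58),
                                    rep(NA, 50), rep(1, 14)))
toy %>%
  group_by(Heat) %>%
  summarise(n = n(),
            n_na = sum(is.na(Total_density)),
            pct_na = round(100 * n_na / n, 1))
# Heat 24: 8/66 = 12.1%; Heat 48: 50/64 = 78.1%
```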
####################
# 6h heat duration
####################
# grab just the treatment with its associated control data
absDen_6h <- rbind(absDen_forFit %>% filter(Heat == "6"),
absDen_forFit %>% filter(Heat == "control", Day < 4))
# create a column for last day of heat, first day of recovery, and last day of recovery
absDen_6h$Trtmt_Day <- "resist"
absDen_6h$Trtmt_Day[absDen_6h$Day == 2] <- "recov_1"
absDen_6h$Trtmt_Day[absDen_6h$Day == 3] <- "recov_2"
# try changing CommRich to unordered factor
absDen_6h$CommRich <- factor(absDen_6h$CommRich, ordered = FALSE)
# save the data to storage for later
productivitySubsettedData <- list(h6 = absDen_6h)
# try dropping inoculated community richness = 4 because it is unitary (i.e., it gets dropped by glmmTMB, and then eff_size complains that it cannot do anything with the resulting NA estimate values)
absDen_6h <- absDen_6h %>% filter(CommRich != 4)
absDen_6h$CommRich <- droplevels(absDen_6h$CommRich)
productivity6h_H0 <- glmmTMB(as.integer(TotDensity_scale * 1000) ~ CommRich*Heat*Trtmt_Day,
data = absDen_6h,
family = genpois,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = productivity6h_H0, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.904 0.936 0.872 0.292 0.912 0.812 0.14 0.676 0.964 0.068 0.848 0.94 0.392 0.828 0.924 0.172 0.08 0.696 0.3384275 0.436 ...
productivity6h_H1 <- glmmTMB(as.integer(TotDensity_scale * 1000) ~ CommRich*Heat*Trtmt_Day + putida*CommRich + putida*Heat,
data = absDen_6h,
family = genpois,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = productivity6h_H1, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.908 0.936 0.856 0.228 0.92 0.828 0.144 0.356 0.916 0.072 0.872 0.948 0.324 0.808 0.916 0.26 0.1184275 0.74 0.328 0.4246324 ...
productivity6h_H2 <- glmmTMB(as.integer(TotDensity_scale * 1000) ~ CommRich*Heat*Trtmt_Day + protegens*CommRich + protegens*Heat,
data = absDen_6h,
family = genpois,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = productivity6h_H2, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.708 0.936 0.78 0.632 0.92 0.716 0.516 0.384 0.98 0.356 0.844 0.976 0.636 0.6 0.86 0.464 0.2264275 0.288 0.548 0.604 ...
# check preferred models
anova(productivity6h_H0, productivity6h_H1)
anova(productivity6h_H0, productivity6h_H2)
AIC(productivity6h_H0, productivity6h_H1, productivity6h_H2) %>% arrange(AIC)
BIC(productivity6h_H0, productivity6h_H1, productivity6h_H2) %>% arrange(BIC)
# H2 is the preferred model
summary(productivity6h_H2)
## Family: genpois ( log )
## Formula:
## as.integer(TotDensity_scale * 1000) ~ CommRich * Heat * Trtmt_Day +
## protegens * CommRich + protegens * Heat
## Data: absDen_6h
##
## AIC BIC logLik deviance df.resid
## 5330.4 5420.1 -2642.2 5284.4 342
##
##
## Dispersion parameter for genpois family (): 387
##
## Conditional model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 7.429e+00 9.955e-02 74.63 <2e-16
## CommRich2 -2.879e-02 1.363e-01 -0.21 0.8327
## CommRich3 -5.411e-02 1.632e-01 -0.33 0.7403
## Heatcontrol -2.920e-01 1.748e-01 -1.67 0.0948
## Trtmt_Dayrecov_2 -3.034e-02 1.394e-01 -0.22 0.8277
## Trtmt_Dayresist 5.534e-02 1.371e-01 0.40 0.6866
## protegens -1.285e+00 1.312e-01 -9.79 <2e-16
## CommRich2:Heatcontrol 1.622e-01 2.229e-01 0.73 0.4667
## CommRich3:Heatcontrol 3.847e-01 2.590e-01 1.48 0.1375
## CommRich2:Trtmt_Dayrecov_2 -8.511e-02 1.898e-01 -0.45 0.6539
## CommRich3:Trtmt_Dayrecov_2 -2.132e-01 2.209e-01 -0.97 0.3345
## CommRich2:Trtmt_Dayresist 2.895e-02 1.861e-01 0.16 0.8764
## CommRich3:Trtmt_Dayresist -4.487e-02 2.151e-01 -0.21 0.8347
## Heatcontrol:Trtmt_Dayrecov_2 -1.716e-01 2.540e-01 -0.68 0.4994
## Heatcontrol:Trtmt_Dayresist 3.121e-02 2.440e-01 0.13 0.8982
## CommRich2:protegens -4.839e-02 1.498e-01 -0.32 0.7466
## CommRich3:protegens -8.439e-05 1.642e-01 0.00 0.9996
## Heatcontrol:protegens -4.988e-02 1.175e-01 -0.42 0.6713
## CommRich2:Heatcontrol:Trtmt_Dayrecov_2 8.645e-02 3.204e-01 0.27 0.7873
## CommRich3:Heatcontrol:Trtmt_Dayrecov_2 -1.871e-01 3.666e-01 -0.51 0.6097
## CommRich2:Heatcontrol:Trtmt_Dayresist 1.954e-01 3.074e-01 0.64 0.5251
## CommRich3:Heatcontrol:Trtmt_Dayresist 3.110e-01 3.447e-01 0.90 0.3670
##
## (Intercept) ***
## CommRich2
## CommRich3
## Heatcontrol .
## Trtmt_Dayrecov_2
## Trtmt_Dayresist
## protegens ***
## CommRich2:Heatcontrol
## CommRich3:Heatcontrol
## CommRich2:Trtmt_Dayrecov_2
## CommRich3:Trtmt_Dayrecov_2
## CommRich2:Trtmt_Dayresist
## CommRich3:Trtmt_Dayresist
## Heatcontrol:Trtmt_Dayrecov_2
## Heatcontrol:Trtmt_Dayresist
## CommRich2:protegens
## CommRich3:protegens
## Heatcontrol:protegens
## CommRich2:Heatcontrol:Trtmt_Dayrecov_2
## CommRich3:Heatcontrol:Trtmt_Dayrecov_2
## CommRich2:Heatcontrol:Trtmt_Dayresist
## CommRich3:Heatcontrol:Trtmt_Dayresist
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# create data.frame for plotting
prod_predict <- cbind(productivity6h_H2$frame,
predict(productivity6h_H2, type="response"))
colnames(prod_predict)[c(1,6)] <- c("observed", "predicted")
# and remember to divide by 1000 as we did for transforming the data
prod_predict[,c(1,6)] <- prod_predict[,c(1,6)]/1000
# plot the model predictions against the data
ggplot(prod_predict,
aes(x=Trtmt_Day, y=observed, colour=CommRich)) +
facet_grid(protegens~Heat) +
geom_jitter(alpha=0.4) +
geom_line(aes(y=predicted, group=CommRich)) +
scale_colour_viridis_d(option = "viridis", begin=0.1, end=0.85) +
labs(y="Total density (rescaled)",
colour="CommRich")
# cleanup
rm(prod_predict)
####################
# 12h heat duration
####################
# grab just the treatment with its associated control data
absDen_12h <- rbind(absDen_forFit %>% filter(Heat == "12", Day > 1),
absDen_forFit %>% filter(Heat == "control", Day > 1, Day !=5))
# create a column for last day of heat, first day of recovery, and last day of recovery
absDen_12h$Trtmt_Day <- "resist"
absDen_12h$Trtmt_Day[absDen_12h$Day == 3] <- "recov_1"
absDen_12h$Trtmt_Day[absDen_12h$Day == 4] <- "recov_2"
# change CommRich to unordered factor
absDen_12h$CommRich <- factor(absDen_12h$CommRich, ordered = FALSE)
# save the data to storage for later
productivitySubsettedData[["h12"]] <- absDen_12h
# drop inoculated community richness = 4 because it is unitary for models H1 and H2
absDen_12h <- absDen_12h %>% filter(CommRich != 4)
absDen_12h$CommRich <- droplevels(absDen_12h$CommRich)
productivity12h_H0 <- glmmTMB(as.integer(TotDensity_scale * 1000) ~ CommRich*Heat*Trtmt_Day,
data = absDen_12h,
family = genpois,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = productivity12h_H0, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.852 0.864 0.1684907 0.64 0.244 0.636 0.056 0.5 0.8580955 0.3452495 0.508 0.612 0.428 0.4589084 0.432 0.2569738 0.296 0.808 0.9 0.436 ...
productivity12h_H1 <- glmmTMB(as.integer(TotDensity_scale * 1000) ~ CommRich*Heat*Trtmt_Day + putida*CommRich + putida*Heat,
data = absDen_12h,
family = genpois,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = productivity12h_H1, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.808 0.804 0.08849074 0.648 0.244 0.6420955 0.056 0.34 0.816 0.428 0.58 0.704 0.412 0.432 0.5132495 0.264 0.148 0.764 0.872 0.548 ...
productivity12h_H2 <- glmmTMB(as.integer(TotDensity_scale * 1000) ~ CommRich*Heat*Trtmt_Day + protegens*CommRich + protegens*Heat,
data = absDen_12h,
family = genpois,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = productivity12h_H2, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.852 0.852 0.4124907 0.924 0.7100955 0.564 0.008 0.7892495 0.868 0.6229084 0.792 0.308 0.54 0.5724869 0.568 0.108 0.576 0.764 0.932 0.7300784 ...
# check preferred models
anova(productivity12h_H0, productivity12h_H1)
anova(productivity12h_H0, productivity12h_H2)
AIC(productivity12h_H0, productivity12h_H1, productivity12h_H2) %>% arrange(AIC)
BIC(productivity12h_H0, productivity12h_H1, productivity12h_H2) %>% arrange(BIC)
# H2 is the preferred model
summary(productivity12h_H2)
## Family: genpois ( log )
## Formula:
## as.integer(TotDensity_scale * 1000) ~ CommRich * Heat * Trtmt_Day +
## protegens * CommRich + protegens * Heat
## Data: absDen_12h
##
## AIC BIC logLik deviance df.resid
## 4605.7 4692.8 -2279.8 4559.7 304
##
##
## Dispersion parameter for genpois family (): 409
##
## Conditional model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 7.52603 0.14895 50.53 < 2e-16 ***
## CommRich2 -0.47396 0.18456 -2.57 0.010227 *
## CommRich3 -0.24722 0.21884 -1.13 0.258594
## Heatcontrol -0.50106 0.20912 -2.40 0.016573 *
## Trtmt_Dayrecov_2 -0.09367 0.18856 -0.50 0.619372
## Trtmt_Dayresist -1.32241 0.23603 -5.60 2.11e-08 ***
## protegens -1.29444 0.15647 -8.27 < 2e-16 ***
## CommRich2:Heatcontrol 0.54418 0.25387 2.14 0.032068 *
## CommRich3:Heatcontrol 0.30799 0.28898 1.07 0.286514
## CommRich2:Trtmt_Dayrecov_2 -0.31421 0.24475 -1.28 0.199204
## CommRich3:Trtmt_Dayrecov_2 -0.03415 0.26765 -0.13 0.898477
## CommRich2:Trtmt_Dayresist 0.93594 0.28388 3.30 0.000977 ***
## CommRich3:Trtmt_Dayresist 1.30899 0.29900 4.38 1.20e-05 ***
## Heatcontrol:Trtmt_Dayrecov_2 0.08100 0.29181 0.28 0.781336
## Heatcontrol:Trtmt_Dayresist 1.47197 0.31450 4.68 2.86e-06 ***
## CommRich2:protegens 0.16611 0.17315 0.96 0.337387
## CommRich3:protegens -0.12256 0.18662 -0.66 0.511351
## Heatcontrol:protegens -0.27796 0.13002 -2.14 0.032528 *
## CommRich2:Heatcontrol:Trtmt_Dayrecov_2 0.45167 0.36462 1.24 0.215441
## CommRich3:Heatcontrol:Trtmt_Dayrecov_2 0.35112 0.40948 0.86 0.391186
## CommRich2:Heatcontrol:Trtmt_Dayresist -0.88906 0.38138 -2.33 0.019743 *
## CommRich3:Heatcontrol:Trtmt_Dayresist -0.79365 0.41422 -1.92 0.055362 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# create data.frame for plotting
prod_predict <- cbind(productivity12h_H2$frame,
predict(productivity12h_H2, type="response"))
colnames(prod_predict)[c(1,6)] <- c("observed", "predicted")
# and remember to divide by 1000 as we did for transforming the data
prod_predict[,c(1,6)] <- prod_predict[,c(1,6)]/1000
# plot the model predictions against the data
ggplot(prod_predict,
aes(x=Trtmt_Day, y=observed, colour=CommRich)) +
facet_grid(protegens~Heat) +
geom_jitter(alpha=0.4) +
geom_line(aes(y=predicted, group=CommRich)) +
scale_colour_viridis_d(option = "viridis", begin=0.1, end=0.85) +
labs(y="Total density (rescaled)",
colour="CommRich")
# cleanup
rm(prod_predict)
####################
# 24h heat duration
####################
# grab just the treatment with its associated control data
absDen_24h <- rbind(absDen_forFit %>% filter(Heat == "24", Day > 1),
absDen_forFit %>% filter(Heat == "control", Day > 1, Day !=5))
# create a column for last day of heat, first day of recovery, and last day of recovery
absDen_24h$Trtmt_Day <- "resist"
absDen_24h$Trtmt_Day[absDen_24h$Day == 3] <- "recov_1"
absDen_24h$Trtmt_Day[absDen_24h$Day == 4] <- "recov_2"
# change CommRich to unordered factor
absDen_24h$CommRich <- factor(absDen_24h$CommRich, ordered = FALSE)
# save the data to storage for later
productivitySubsettedData[["h24"]] <- absDen_24h
# drop inoculated community richness = 4 because it is unitary for models H1 and H2
absDen_24h <- absDen_24h %>% filter(CommRich != 4)
absDen_24h$CommRich <- droplevels(absDen_24h$CommRich)
productivity24h_H0 <- glmmTMB(as.integer(TotDensity_scale * 1000) ~ CommRich*Heat*Trtmt_Day,
data = absDen_24h,
family = genpois,
control = glmmTMBControl(optCtrl = list(iter.max = 500000, eval.max = 500000)))
simulateResiduals(fittedModel = productivity24h_H0, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.3745807 0.2441765 0.7709235 0.2816641 0.876 0.3065692 0.8 0.6689584 0.07468735 0.09603567 0.864 0.872 0.2415482 0.44 0.4619565 0.0167054 0.652 0.636 0.1180389 0.260213 ...
productivity24h_H1 <- glmmTMB(as.integer(TotDensity_scale * 1000) ~ CommRich*Heat*Trtmt_Day + putida*CommRich + putida*Heat,
data = absDen_24h,
family = genpois,
control = glmmTMBControl(optCtrl = list(iter.max = 500000, eval.max = 500000)))
simulateResiduals(fittedModel = productivity24h_H1, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.3345807 0.2182648 0.7549235 0.5344962 0.948 0.2665692 0.876 0.6369584 0.05401551 0.08300892 0.892 0.892 0.3778704 0.44 0.4619565 0.0167054 0.664 0.608 0.0872623 0.2354556 ...
productivity24h_H2 <- glmmTMB(as.integer(TotDensity_scale * 1000) ~ CommRich*Heat*Trtmt_Day + protegens*CommRich + protegens*Heat,
data = absDen_24h,
family = genpois,
control = glmmTMBControl(optCtrl = list(iter.max = 500000, eval.max = 500000)))
simulateResiduals(fittedModel = productivity24h_H2, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.3545807 0.2352207 0.872 0.2956939 0.828 0.312416 0.736 0.808 0.06899226 0.08383367 0.904 0.936 0.2326718 0.5950089 0.584 0.012 0.736 0.7667741 0.1138695 0.2407054 ...
# check preferred models
anova(productivity24h_H0, productivity24h_H1)
anova(productivity24h_H0, productivity24h_H2)
AIC(productivity24h_H0, productivity24h_H1, productivity24h_H2) %>% arrange(AIC)
BIC(productivity24h_H0, productivity24h_H1, productivity24h_H2) %>% arrange(BIC)
# H2 is the preferred model
summary(productivity24h_H2)
## Family: genpois ( log )
## Formula:
## as.integer(TotDensity_scale * 1000) ~ CommRich * Heat * Trtmt_Day +
## protegens * CommRich + protegens * Heat
## Data: absDen_24h
##
## AIC BIC logLik deviance df.resid
## 4809.4 4896.8 -2381.7 4763.4 308
##
##
## Dispersion parameter for genpois family (): 1.63e+03
##
## Conditional model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 4.8860 0.2427 20.134 < 2e-16 ***
## CommRich2 2.4011 0.2922 8.218 < 2e-16 ***
## CommRich3 2.2084 0.3953 5.587 2.31e-08 ***
## Heatcontrol 2.1096 0.3297 6.398 1.58e-10 ***
## Trtmt_Dayrecov_2 0.7906 0.3271 2.417 0.01566 *
## Trtmt_Dayresist 0.1864 0.3383 0.551 0.58172
## protegens 0.4419 0.2553 1.731 0.08341 .
## CommRich2:Heatcontrol -1.8017 0.3825 -4.710 2.48e-06 ***
## CommRich3:Heatcontrol -1.7739 0.4453 -3.983 6.80e-05 ***
## CommRich2:Trtmt_Dayrecov_2 -0.9452 0.3676 -2.571 0.01014 *
## CommRich3:Trtmt_Dayrecov_2 -1.0175 0.4064 -2.504 0.01228 *
## CommRich2:Trtmt_Dayresist -2.1974 0.4187 -5.248 1.53e-07 ***
## CommRich3:Trtmt_Dayresist -1.4758 0.4795 -3.078 0.00209 **
## Heatcontrol:Trtmt_Dayrecov_2 -0.7592 0.4280 -1.774 0.07607 .
## Heatcontrol:Trtmt_Dayresist 0.1803 0.4326 0.417 0.67689
## CommRich2:protegens -0.9424 0.2357 -3.999 6.37e-05 ***
## CommRich3:protegens -0.7968 0.2764 -2.883 0.00394 **
## Heatcontrol:protegens -1.0359 0.2215 -4.678 2.90e-06 ***
## CommRich2:Heatcontrol:Trtmt_Dayrecov_2 1.0078 0.4945 2.038 0.04154 *
## CommRich3:Heatcontrol:Trtmt_Dayrecov_2 1.2348 0.5495 2.247 0.02463 *
## CommRich2:Heatcontrol:Trtmt_Dayresist 1.9810 0.5248 3.775 0.00016 ***
## CommRich3:Heatcontrol:Trtmt_Dayresist 1.6277 0.5999 2.713 0.00666 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# create data.frame for plotting
prod_predict <- cbind(productivity24h_H2$frame,
predict(productivity24h_H2, type="response"))
colnames(prod_predict)[c(1,6)] <- c("observed", "predicted")
# and remember to divide by 1000 as we did for transforming the data
prod_predict[,c(1,6)] <- prod_predict[,c(1,6)]/1000
# plot the model predictions against the data
ggplot(prod_predict,
aes(x=Trtmt_Day, y=observed, colour=CommRich)) +
facet_grid(protegens~Heat) +
geom_jitter(alpha=0.4) +
geom_line(aes(y=predicted, group=CommRich)) +
scale_colour_viridis_d(option = "viridis", begin=0.1, end=0.85) +
labs(y="Total density (rescaled)",
colour="CommRich")
# cleanup
rm(prod_predict)
####################
# 48h heat duration
####################
# grab just the treatment with its associated control data
absDen_48h <- rbind(absDen_forFit %>% filter(Heat == "48", Day > 2),
absDen_forFit %>% filter(Heat == "control", Day > 2))
# create a column for last day of heat, first day of recovery, and last day of recovery
absDen_48h$Trtmt_Day <- "resist"
absDen_48h$Trtmt_Day[absDen_48h$Day == 4] <- "recov_1"
absDen_48h$Trtmt_Day[absDen_48h$Day == 5] <- "recov_2"
# change CommRich to unordered factor
absDen_48h$CommRich <- factor(absDen_48h$CommRich, ordered = FALSE)
# save the data to storage for later
productivitySubsettedData[["h48"]] <- absDen_48h
# drop inoculated community richness = 4 because it is unitary for models H1 and H2
absDen_48h <- absDen_48h %>% filter(CommRich != 4)
absDen_48h$CommRich <- droplevels(absDen_48h$CommRich)
productivity48h_H0 <- glmmTMB(as.integer(TotDensity_scale * 1000) ~ CommRich*Heat*Trtmt_Day,
data = absDen_48h,
family = genpois,
control = glmmTMBControl(optCtrl = list(iter.max = 500000, eval.max = 500000)))
simulateResiduals(fittedModel = productivity48h_H0, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.6617478 0.6786964 0.7920234 0.6443595 0.6095996 0.7943915 0.821849 0.5121696 0.5841971 0.1080135 0.1127676 0.3657588 0.07855458 0.08745714 0.4346818 0.4534248 0.3363609 0.7474522 0.976 0.968 ...
productivity48h_H1 <- glmmTMB(as.integer(TotDensity_scale * 1000) ~ CommRich*Heat*Trtmt_Day + putida*CommRich + putida*Heat,
data = absDen_48h,
family = genpois,
control = glmmTMBControl(optCtrl = list(iter.max = 500000, eval.max = 500000)))
simulateResiduals(fittedModel = productivity48h_H1, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.6119589 0.7385038 0.7220273 0.7158086 0.5733329 0.8386422 0.8441048 0.5580576 0.6025469 0.08154663 0.1339115 0.3313922 0.08569591 0.09087344 0.4596226 0.4248626 0.3683952 0.8076165 0.976 0.964 ...
productivity48h_H2 <- glmmTMB(as.integer(TotDensity_scale * 1000) ~ CommRich*Heat*Trtmt_Day + protegens*CommRich + protegens*Heat,
data = absDen_48h,
family = genpois,
control = glmmTMBControl(optCtrl = list(iter.max = 500000, eval.max = 500000)))
simulateResiduals(fittedModel = productivity48h_H2, plot = TRUE)
## Warning in newton(lsp = lsp, X = G$X, y = G$y, Eb = G$Eb, UrS = G$UrS, L = G$L,
## : Fitting terminated with step failure - check results carefully
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.6119589 0.6105038 0.9900039 0.5409104 0.5645329 0.7626422 0.789849 0.5127808 0.5841971 0.1723924 0.08898069 0.3215732 0.07345363 0.08745714 0.4382448 0.4534248 0.3043265 0.709644 1 0.992 ...
# check preferred models
anova(productivity48h_H0, productivity48h_H1)
anova(productivity48h_H0, productivity48h_H2)
AIC(productivity48h_H0, productivity48h_H1, productivity48h_H2) %>% arrange(AIC)
BIC(productivity48h_H0, productivity48h_H1, productivity48h_H2) %>% arrange(BIC)
# H2 is the preferred model
summary(productivity48h_H2)
## Family: genpois ( log )
## Formula:
## as.integer(TotDensity_scale * 1000) ~ CommRich * Heat * Trtmt_Day +
## protegens * CommRich + protegens * Heat
## Data: absDen_48h
##
## AIC BIC logLik deviance df.resid
## 3581.8 3666.8 -1767.9 3535.8 275
##
##
## Dispersion parameter for genpois family (): 731
##
## Conditional model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 2.92383 0.32237 9.070 < 2e-16 ***
## CommRich2 0.57010 0.35987 1.584 0.113
## CommRich3 0.08673 0.40646 0.213 0.831
## Heatcontrol 4.10173 0.35436 11.575 < 2e-16 ***
## Trtmt_Dayrecov_2 -0.16558 0.38471 -0.430 0.667
## Trtmt_Dayresist -2.70931 0.64238 -4.218 2.47e-05 ***
## protegens 2.87044 0.27136 10.578 < 2e-16 ***
## CommRich2:Heatcontrol -0.24532 0.39123 -0.627 0.531
## CommRich3:Heatcontrol 0.04008 0.42801 0.094 0.925
## CommRich2:Trtmt_Dayrecov_2 -0.04857 0.44409 -0.109 0.913
## CommRich3:Trtmt_Dayrecov_2 0.01517 0.47227 0.032 0.974
## CommRich2:Trtmt_Dayresist -0.59666 0.77140 -0.773 0.439
## CommRich3:Trtmt_Dayresist -0.29438 0.83485 -0.353 0.724
## Heatcontrol:Trtmt_Dayrecov_2 -0.42134 0.47287 -0.891 0.373
## Heatcontrol:Trtmt_Dayresist 2.69833 0.68802 3.922 8.79e-05 ***
## CommRich2:protegens -0.25056 0.23788 -1.053 0.292
## CommRich3:protegens 0.01468 0.28047 0.052 0.958
## Heatcontrol:protegens -3.96844 0.25545 -15.535 < 2e-16 ***
## CommRich2:Heatcontrol:Trtmt_Dayrecov_2 -0.10041 0.55259 -0.182 0.856
## CommRich3:Heatcontrol:Trtmt_Dayrecov_2 0.08979 0.59911 0.150 0.881
## CommRich2:Heatcontrol:Trtmt_Dayresist 0.49450 0.82673 0.598 0.550
## CommRich3:Heatcontrol:Trtmt_Dayresist 0.05933 0.90207 0.066 0.948
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# create data.frame for plotting
prod_predict <- cbind(productivity48h_H2$frame,
predict(productivity48h_H2, type="response"))
colnames(prod_predict)[c(1,6)] <- c("observed", "predicted")
# and remember to divide by 1000 as we did for transforming the data
prod_predict[,c(1,6)] <- prod_predict[,c(1,6)]/1000
# plot the model predictions against the data
ggplot(prod_predict,
aes(x=Trtmt_Day, y=observed, colour=CommRich)) +
facet_grid(protegens~Heat) +
geom_jitter(alpha=0.4) +
geom_line(aes(y=predicted, group=CommRich)) +
scale_colour_viridis_d(option = "viridis", begin=0.1, end=0.85) +
labs(y="Total density (rescaled)",
colour="CommRich")
# cleanup
rm(prod_predict)
#######################
# effect sizes
#######################
# plot the effect size contingent on protegens
effect_6h_protegens <- eff_size(emmeans(productivity6h_H2, ~ Heat | CommRich*Trtmt_Day + protegens, data = absDen_6h),
sigma(productivity6h_H2),
edf = df.residual(productivity6h_H2))
effect_12h_protegens <- eff_size(emmeans(productivity12h_H2, ~ Heat | CommRich*Trtmt_Day + protegens, data = absDen_12h),
sigma(productivity12h_H2),
edf = df.residual(productivity12h_H2))
effect_24h_protegens <- eff_size(emmeans(productivity24h_H2, ~ Heat | CommRich*Trtmt_Day + protegens, data = absDen_24h),
sigma(productivity24h_H2),
edf = df.residual(productivity24h_H2))
effect_48h_protegens <- eff_size(emmeans(productivity48h_H2, ~ Heat | CommRich*Trtmt_Day + protegens, data = absDen_48h),
sigma(productivity48h_H2),
edf = df.residual(productivity48h_H2))
# a function that extracts the confidence intervals from eff_size contingent on protegens
get_effsize_CIs <- function(eff_size_object, heat_trtmt) {
data.frame(Heat = heat_trtmt,
CommRich = confint(eff_size_object)[[2]],
Trtmt_Day = confint(eff_size_object)[[3]],
protegens = confint(eff_size_object)[[4]],
effect_est = confint(eff_size_object)[[5]],
effect_loCI = confint(eff_size_object)[[8]],
effect_hiCI = confint(eff_size_object)[[9]])
}
# create a data.frame for plotting marginal effect sizes using a forest plot
productivity_protegens <- data.frame()
productivity_protegens <- rbind(productivity_protegens,
get_effsize_CIs(effect_6h_protegens, heat_trtmt = 6),
get_effsize_CIs(effect_12h_protegens, heat_trtmt = 12),
get_effsize_CIs(effect_24h_protegens, heat_trtmt = 24),
get_effsize_CIs(effect_48h_protegens, heat_trtmt = 48))
# re-order the levels of Trtmt_Day to go from resistance to recovery then rename them for nice plotting
productivity_protegens$Trtmt_Day <- factor(productivity_protegens$Trtmt_Day,
levels = c("recov_2", "recov_1", "resist"))
levels(productivity_protegens$Trtmt_Day) <- c("Recovery (Day 2)", "Recovery (Day 1)", "Resistance")
ggplot(productivity_protegens,
aes(x = effect_est, y = CommRich, colour = Trtmt_Day, shape = as.logical(protegens))) +
facet_grid(. ~ Heat) +
geom_vline(xintercept = 0, colour="darkgrey") +
geom_point(position = position_dodge(width = 0.5)) +
geom_errorbarh(position = position_dodge(width = 0.5),
aes(xmin = effect_loCI, xmax = effect_hiCI), height = 0.1) +
scale_colour_manual(values=trtmt_pal) +
labs(x = "Effect Size on Total Density",
y = "Inoculated Community Richness",
shape = "protegens\npresent?",
title = "(with extinct reps)") +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))
# only 48h has any noticeable effect with protegens
# let's look at the post-hoc just for 48h with protegens
posthoc_48h_protegens <- emmeans(effect_48h_protegens,
pairwise ~ CommRich + Trtmt_Day + protegens,
data = absDen_48h)
print("Post-hoc for productivity at 48h (extinct replicates INCLUDED) conditional on protegens:")
## [1] "Post-hoc for productivity at 48h (extinct replicates INCLUDED) conditional on protegens:"
multcomp::cld(posthoc_48h_protegens, alpha=0.05/4, Letters = letters)
# okay, this makes sense. Only 48h of heat is long enough to depress total density even after 2 days of recovery
# But, in the presence of protegens, total density can bounce back. Meanwhile, in the absence of protegens, total density does not bounce back even after 2 days of recovery.
# But we are not interested in the details of protegens. Let's do the post-hoc again now averaging across the effects of protegens.
posthoc_6h <- emmeans(effect_6h_protegens,
pairwise ~ CommRich + Trtmt_Day,
data = absDen_6h)
## NOTE: Results may be misleading due to involvement in interactions
posthoc_12h <- emmeans(effect_12h_protegens,
pairwise ~ CommRich + Trtmt_Day,
data = absDen_12h)
## NOTE: Results may be misleading due to involvement in interactions
posthoc_24h <- emmeans(effect_24h_protegens,
pairwise ~ CommRich + Trtmt_Day,
data = absDen_24h)
## NOTE: Results may be misleading due to involvement in interactions
posthoc_48h <- emmeans(effect_48h_protegens,
pairwise ~ CommRich + Trtmt_Day,
data = absDen_48h)
## NOTE: Results may be misleading due to involvement in interactions
# create a data.frame for plotting marginal effect sizes using a forest plot with the group labels
productivity_effects <- data.frame()
productivity_effects <- rbind(productivity_effects,
get_posthoc(posthoc_6h, heat_trtmt = 6),
get_posthoc(posthoc_12h, heat_trtmt = 12),
get_posthoc(posthoc_24h, heat_trtmt = 24),
get_posthoc(posthoc_48h, heat_trtmt = 48))
# re-order the levels of Trtmt_Day to go from resistance to recovery then rename them for nice plotting
productivity_effects$Trtmt_Day <- factor(productivity_effects$Trtmt_Day,
levels = c("recov_2", "recov_1", "resist"))
levels(productivity_effects$Trtmt_Day) <- c("Recovery (Day 2)", "Recovery (Day 1)", "Resistance")
# plot
ggplot(productivity_effects,
aes(x = est, y = CommRich, colour = Trtmt_Day)) +
facet_grid(. ~ Heat) +
geom_vline(xintercept = 0, colour="darkgrey") +
geom_point(position = position_dodge(width = 0.5)) +
geom_errorbarh(position = position_dodge(width = 0.5),
aes(xmin = loCI, xmax = hiCI), height = 0.1) +
geom_text(position = position_dodge(width = 0.5),
aes(x=-0.009, label=groups)) +
scale_x_continuous(breaks=c(-0.006, -0.003, 0), limits=c(-0.01, 0.003)) +
scale_colour_manual(values=trtmt_pal) +
labs(x = "Effect Size on Total Density",
y = "Inoculated Community Richness",
title = "Averaged across protegens (with extinct reps)") +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))
# finally, do the t-tests
# estimate the sample sizes
temp <- productivity_effects # copy the effects to temp
productivity_effects <- rbind(temp %>% filter(Heat == 6, CommRich == 1) %>% mutate(n = estimate_n(absDen_6h, CommRich=1)),
temp %>% filter(Heat == 6, CommRich == 2) %>% mutate(n = estimate_n(absDen_6h, CommRich=2)),
temp %>% filter(Heat == 6, CommRich == 3) %>% mutate(n = estimate_n(absDen_6h, CommRich=3)),
temp %>% filter(Heat == 12, CommRich == 1) %>% mutate(n = estimate_n(absDen_12h, CommRich=1)),
temp %>% filter(Heat == 12, CommRich == 2) %>% mutate(n = estimate_n(absDen_12h, CommRich=2)),
temp %>% filter(Heat == 12, CommRich == 3) %>% mutate(n = estimate_n(absDen_12h, CommRich=3)),
temp %>% filter(Heat == 24, CommRich == 1) %>% mutate(n = estimate_n(absDen_24h, CommRich=1)),
temp %>% filter(Heat == 24, CommRich == 2) %>% mutate(n = estimate_n(absDen_24h, CommRich=2)),
temp %>% filter(Heat == 24, CommRich == 3) %>% mutate(n = estimate_n(absDen_24h, CommRich=3)),
temp %>% filter(Heat == 48, CommRich == 1) %>% mutate(n = estimate_n(absDen_48h, CommRich=1)),
temp %>% filter(Heat == 48, CommRich == 2) %>% mutate(n = estimate_n(absDen_48h, CommRich=2)),
temp %>% filter(Heat == 48, CommRich == 3) %>% mutate(n = estimate_n(absDen_48h, CommRich=3)))
rm(temp)
# estimate the SD from the SE
productivity_effects <- productivity_effects %>% mutate(SD = SE * sqrt(n)) %>%
# re-order by Heat, Trtmt_Day, and CommRich
arrange(Heat, Trtmt_Day, CommRich)
# all pairwise combinations of comparisons between the same treatment day for different durations
temp <- t(combn(c(1,10,19,28), 2))
combos <- rbind(temp, temp+1, temp+2, temp+3, temp+4, temp+5, temp+6, temp+7, temp+8)
rm(temp)
# loop through all the combinations and do the t-tests
prodEffects_ttests <- data.frame()
for(i in 1:nrow(combos)){
prodEffects_ttests <- rbind(prodEffects_ttests,
run_ttest(row_x = combos[i,1],
row_y = combos[i,2],
summary_stats_df=productivity_effects))
}
prodEffects_ttests$adjusted_p <- p.adjust(prodEffects_ttests$pvalue, method = "bonferroni")
prodEffects_ttests$Trtmt_Day <- productivity_effects$Trtmt_Day[combos[,1]]
prodEffects_ttests$Heat_1 <- productivity_effects$Heat[combos[,1]]
prodEffects_ttests$Heat_2 <- productivity_effects$Heat[combos[,2]]
prodEffects_ttests$CommRich_1 <- productivity_effects$CommRich[combos[,1]]
prodEffects_ttests$CommRich_2 <- productivity_effects$CommRich[combos[,2]]
print(prodEffects_ttests)
## t_statistic df pvalue adjusted_p Trtmt_Day Heat_1
## t -0.2877406 3.988398 7.878773e-01 1.000000e+00 Recovery (Day 2) 6
## t1 7.7079695 5.195711 4.927845e-04 2.661036e-02 Recovery (Day 2) 6
## t2 11.4432675 7.373288 5.872224e-06 3.171001e-04 Recovery (Day 2) 6
## t3 5.9640547 2.310562 1.897938e-02 1.000000e+00 Recovery (Day 2) 12
## t4 9.7066144 4.142219 5.304329e-04 2.864338e-02 Recovery (Day 2) 12
## t5 7.4350441 4.436533 1.150994e-03 6.215365e-02 Recovery (Day 2) 24
## t6 8.3047396 11.752375 2.944540e-06 1.590051e-04 Recovery (Day 2) 6
## t7 4.4425342 7.848096 2.264897e-03 1.223044e-01 Recovery (Day 2) 6
## t8 13.1412778 11.748171 2.213809e-08 1.195457e-06 Recovery (Day 2) 6
## t9 -6.5828142 6.958951 3.171070e-04 1.712378e-02 Recovery (Day 2) 12
## t10 3.9177953 11.058097 2.378321e-03 1.284293e-01 Recovery (Day 2) 12
## t11 13.1872385 7.070111 3.094706e-06 1.671141e-04 Recovery (Day 2) 24
## t12 3.1026815 7.734006 1.522879e-02 8.223544e-01 Recovery (Day 2) 6
## t13 3.8452345 5.150237 1.139705e-02 6.154405e-01 Recovery (Day 2) 6
## t14 10.7605267 6.994536 1.324381e-05 7.151657e-04 Recovery (Day 2) 6
## t15 -0.2632805 4.812612 8.032327e-01 1.000000e+00 Recovery (Day 2) 12
## t16 7.4071575 6.748006 1.777538e-04 9.598703e-03 Recovery (Day 2) 12
## t17 10.5599359 4.109733 3.946012e-04 2.130846e-02 Recovery (Day 2) 24
## t18 -2.0833373 4.013176 1.053951e-01 1.000000e+00 Recovery (Day 1) 6
## t19 7.8510723 5.613125 3.126431e-04 1.688273e-02 Recovery (Day 1) 6
## t20 12.3046811 7.394212 3.440229e-06 1.857724e-04 Recovery (Day 1) 6
## t21 8.2597661 2.437015 7.715255e-03 4.166238e-01 Recovery (Day 1) 12
## t22 12.2057915 4.126727 2.156886e-04 1.164718e-02 Recovery (Day 1) 12
## t23 8.0231035 4.805003 5.846560e-04 3.157143e-02 Recovery (Day 1) 24
## t24 0.8528785 11.857990 4.106231e-01 1.000000e+00 Recovery (Day 1) 6
## t25 1.9528816 7.487006 8.908634e-02 1.000000e+00 Recovery (Day 1) 6
## t26 16.3228270 11.749652 1.958246e-09 1.057453e-07 Recovery (Day 1) 6
## t27 0.7161796 6.750765 4.978989e-01 1.000000e+00 Recovery (Day 1) 12
## t28 14.8540977 11.131046 1.095589e-08 5.916181e-07 Recovery (Day 1) 12
## t29 20.4939275 6.730476 2.570251e-07 1.387935e-05 Recovery (Day 1) 24
## t30 -3.1969557 7.640380 1.346082e-02 7.268845e-01 Recovery (Day 1) 6
## t31 -1.2724400 5.239119 2.567646e-01 1.000000e+00 Recovery (Day 1) 6
## t32 9.5855371 6.920780 3.040646e-05 1.641949e-03 Recovery (Day 1) 6
## t33 2.8801626 4.750856 3.671164e-02 1.000000e+00 Recovery (Day 1) 12
## t34 12.3922625 6.748861 6.856515e-06 3.702518e-04 Recovery (Day 1) 12
## t35 14.0271885 4.037086 1.413720e-04 7.634086e-03 Recovery (Day 1) 24
## t36 6.9210843 3.441254 3.910995e-03 2.111937e-01 Resistance 6
## t37 8.1327416 5.579291 2.680648e-04 1.447550e-02 Resistance 6
## t38 15.8302033 5.137819 1.476562e-05 7.973437e-04 Resistance 6
## t39 -2.6506331 2.296683 1.019491e-01 1.000000e+00 Resistance 12
## t40 8.4368564 5.453709 2.487411e-04 1.343202e-02 Resistance 12
## t41 12.9225790 3.852515 2.583195e-04 1.394925e-02 Resistance 24
## t42 5.0893220 11.322217 3.196226e-04 1.725962e-02 Resistance 6
## t43 7.4234676 8.611032 5.066805e-05 2.736075e-03 Resistance 6
## t44 23.2185298 7.917187 1.445257e-08 7.804386e-07 Resistance 6
## t45 0.0475035 7.234063 9.633969e-01 1.000000e+00 Resistance 12
## t46 18.8862967 8.907583 1.714752e-08 9.259660e-07 Resistance 12
## t47 21.3754670 5.991932 6.935617e-07 3.745233e-05 Resistance 24
## t48 -0.7525000 7.669884 4.742090e-01 1.000000e+00 Resistance 6
## t49 -0.2769900 5.602401 7.917215e-01 1.000000e+00 Resistance 6
## t50 12.8583864 4.371681 1.223023e-04 6.604323e-03 Resistance 6
## t51 0.6787958 5.105163 5.268152e-01 1.000000e+00 Resistance 12
## t52 13.1522403 4.549742 8.574260e-05 4.630100e-03 Resistance 12
## t53 14.0506071 3.313104 4.561890e-04 2.463421e-02 Resistance 24
## Heat_2 CommRich_1 CommRich_2
## t 12 1 1
## t1 24 1 1
## t2 48 1 1
## t3 24 1 1
## t4 48 1 1
## t5 48 1 1
## t6 12 2 2
## t7 24 2 2
## t8 48 2 2
## t9 24 2 2
## t10 48 2 2
## t11 48 2 2
## t12 12 3 3
## t13 24 3 3
## t14 48 3 3
## t15 24 3 3
## t16 48 3 3
## t17 48 3 3
## t18 12 1 1
## t19 24 1 1
## t20 48 1 1
## t21 24 1 1
## t22 48 1 1
## t23 48 1 1
## t24 12 2 2
## t25 24 2 2
## t26 48 2 2
## t27 24 2 2
## t28 48 2 2
## t29 48 2 2
## t30 12 3 3
## t31 24 3 3
## t32 48 3 3
## t33 24 3 3
## t34 48 3 3
## t35 48 3 3
## t36 12 1 1
## t37 24 1 1
## t38 48 1 1
## t39 24 1 1
## t40 48 1 1
## t41 48 1 1
## t42 12 2 2
## t43 24 2 2
## t44 48 2 2
## t45 24 2 2
## t46 48 2 2
## t47 48 2 2
## t48 12 3 3
## t49 24 3 3
## t50 48 3 3
## t51 24 3 3
## t52 48 3 3
## t53 48 3 3
# plot again without the group labels
ggplot(productivity_effects,
aes(x = est, y = CommRich, colour = Trtmt_Day)) +
facet_grid(. ~ Heat) +
geom_vline(xintercept = 0, colour="darkgrey") +
geom_point(position = position_dodge(width = 0.5)) +
geom_errorbarh(position = position_dodge(width = 0.5),
aes(xmin = loCI, xmax = hiCI), height = 0.1) +
scale_x_continuous(breaks=c(-0.006, -0.003, 0), limits=c(-0.01, 0.003)) +
scale_colour_manual(values=trtmt_pal) +
labs(x = "Effect Size on Total Density",
y = "Inoculated Community Richness",
title = "Averaged across protegens (with extinct reps)") +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))
rm(productivity6h_H0, productivity6h_H1, productivity6h_H2,
productivity12h_H0, productivity12h_H1, productivity12h_H2,
productivity24h_H0, productivity24h_H1, productivity24h_H2,
productivity48h_H0, productivity48h_H1, productivity48h_H2,
effect_6h_protegens, effect_12h_protegens, effect_24h_protegens, effect_48h_protegens,
posthoc_6h, posthoc_12h, posthoc_24h, posthoc_48h, posthoc_48h_protegens,
productivity_protegens, prodEffects_ttests)
A main problem of our analysis is that I can't do a post-hoc to get
statistical significance between different heat durations (because the
effect sizes come from different data subsets). Here I have followed
Maddy's suggestion to do all pairwise combinations of two-tailed t-tests
to compare between models. To do this more elegantly, maybe I can re-fit
the models in a Bayesian framework using the package brms?
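For reference, those t-tests work from summary statistics only: the marginal effect estimates, SDs reconstructed as SE * sqrt(n), and the estimated sample sizes. Below is a minimal, self-contained sketch of a two-tailed Welch t-test from summary statistics; `welch_from_summary` is an illustrative stand-in, not the actual `run_ttest()` helper used above.

```r
# A sketch of a two-tailed Welch t-test computed from summary statistics,
# i.e. the kind of comparison run_ttest() is used for above.
# `welch_from_summary` is an illustrative name, not the actual helper.
welch_from_summary <- function(mean_x, sd_x, n_x, mean_y, sd_y, n_y) {
  se2_x <- sd_x^2 / n_x   # squared standard errors
  se2_y <- sd_y^2 / n_y
  t_stat <- (mean_x - mean_y) / sqrt(se2_x + se2_y)
  # Welch-Satterthwaite approximation to the degrees of freedom
  df <- (se2_x + se2_y)^2 / (se2_x^2 / (n_x - 1) + se2_y^2 / (n_y - 1))
  data.frame(t_statistic = t_stat, df = df,
             pvalue = 2 * pt(-abs(t_stat), df))
}
# identical groups give t = 0, df = 8, p = 1
welch_from_summary(1, 1, 5, 1, 1, 5)
```

The p-values from all pairwise comparisons are then Bonferroni-adjusted with `p.adjust()` as above.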
When we plot the effect sizes contingent on protegens, we see that there's little interaction between protegens and heat until the very longest heat duration. On the other hand, the presence of protegens doesn't change the presence or extent of decoupling. This is why a post-hoc analysis was used to average over the effect of protegens.
The main thing we see is that total densities are always lowest during resistance compared to the recovery days (which makes sense because the heat is on). When we compare within resistance or within recovery between different heat durations, we find that the shorter heat durations (6h vs 12h) tend not to differ from each other, and the longer heat durations (24h vs 48h) tend not to differ from each other. But 6h and 48h are always different from each other.
I’m not really convinced that anything in particular is happening in terms of the biodiversity effect…
Let’s look at decoupling:
Given the thermal performance curves of the 4 species, we would expect to see a correlation between resistance and recovery for both the slow-growing species (i.e., low resistance during heat and slow recovery post-heat) and the fast-growing species (i.e., high resistance during heat and fast recovery post-heat). I'm not sure what intuition I would have a priori about the effect of diversity on decoupling… but this is what Maddy is interested in.
# a function that takes the multcomp::cld letters from 2 groups and returns TRUE when no letters are shared (or FALSE when any letter is shared)
are_groups_different <- function(group1, group2) {
# convert the groups columns into TRUE/FALSE columns indicating significant difference between resistance and recovery effect sizes
first_group <- group1 %>%
# remove any white space
str_trim() %>%
# split the string up into single characters
strsplit(x=., split = character(0))
second_group <- group2 %>%
# remove any white space
str_trim() %>%
# split the string up into single letters
strsplit(x=., split = character(0))
# test if any letters are common. If there are, then they are NOT different so return FALSE (and vice versa).
return( !any(first_group[[1]] %in% second_group[[1]]) )
}
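The letter-sharing logic can be sanity-checked in isolation. Here is a self-contained base-R restatement (the cld-style strings below are made up for illustration, not taken from the model output):

```r
# TRUE when two cld-style group labels share any letter (i.e. the groups
# are NOT significantly different); are_groups_different() above returns
# the negation of this
shares_letter <- function(g1, g2) {
  any(strsplit(trimws(g1), "")[[1]] %in% strsplit(trimws(g2), "")[[1]])
}
shares_letter(" ab ", "bc")  # TRUE: both contain "b", so not different
shares_letter("a", " bc ")   # FALSE: no shared letters, so different
```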
# a function to calculate distance from the point (x, y) to the line y = x: positive values are ABOVE the line and negative values are BELOW the line.
# this is used to calculate decoupling
dist_to_xyline <- function(x, y) {
(y - x) / sqrt(2) # signed perpendicular distance from (x, y) to the line y = x
}
# a function to estimate mean decoupling and its confidence intervals given the mean and SYMMETRIC confidence intervals for resistance and recovery.
# Note that I can use the univariate confidence intervals only by assuming there's no correlation between resistance and recovery (which is exactly the opposite of the whole point of coupling)
# ...also, beware that the CIs come from a post-hoc, so they are more conservative than the real CIs...
estimate_decoupling <- function(resist_est, resist_hiCI,
recov_est, recov_hiCI) {
# check the input values
if(resist_hiCI < resist_est)
stop("`resist_hiCI` must be the *UPPER* confidence interval on resistance.")
if(recov_hiCI < recov_est)
stop("`recov_hiCI` must be the *UPPER* confidence interval on recovery.")
# get the co-ordinates that define the ellipse
x0 <- resist_est # x-coordinate of the center of the ellipse
y0 <- recov_est # y-coordinate of the center of the ellipse
a <- resist_hiCI - resist_est # horizontal semi-axis of the ellipse
b <- recov_hiCI - recov_est # vertical semi-axis of the ellipse
# generate points on the perimeter of the ellipse
theta <- seq(0, 2 * pi, length.out = 360) # angles
x_ellipse <- x0 + a * cos(theta) # x-coordinates on the ellipse
y_ellipse <- y0 + b * sin(theta) # y-coordinates on the ellipse
# decoupling measures the distance between the point and the y=x line
est <- dist_to_xyline(x0, y0) # (renamed from `mean` to avoid shadowing base::mean)
# do the same for all points on the ellipse defining the CI
distances <- dist_to_xyline(x_ellipse, y_ellipse)
# maximum and minimum distances define the hiCI and loCI, respectively
hiCI <- max(distances)
loCI <- min(distances)
return(c(est_decoupling = est, loCI_decoupling = loCI, hiCI_decoupling = hiCI))
}
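As a sanity check on the ellipse construction: since b*sin(theta) - a*cos(theta) has amplitude sqrt(a^2 + b^2), the min/max signed distances have the closed form (y0 - x0)/sqrt(2) +/- sqrt(a^2 + b^2)/sqrt(2), which the 360-point discretisation should match closely. A self-contained check with made-up numbers:

```r
# made-up resistance/recovery estimates and CI half-widths
x0 <- -0.004; a <- 0.002  # resistance estimate, horizontal semi-axis
y0 <- -0.001; b <- 0.003  # recovery estimate, vertical semi-axis
theta <- seq(0, 2 * pi, length.out = 360)
# signed distance from each point on the CI ellipse to the y = x line
d <- ((y0 + b * sin(theta)) - (x0 + a * cos(theta))) / sqrt(2)
# closed-form lower/upper bounds for comparison
closed_form <- (y0 - x0) / sqrt(2) + c(-1, 1) * sqrt(a^2 + b^2) / sqrt(2)
range(d)  # should agree with closed_form to within discretisation error
```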
# positive values are ABOVE the y=x line and negative values are BELOW the y=x line
# rename the levels of Trtmt_Day
decoupling_productivity <- productivity_effects
levels(decoupling_productivity$Trtmt_Day) <- c("late_recov", "early_recov", "resist")
# create data.frame for plotting
decoupling_productivity <- decoupling_productivity %>%
select(-n, -SD) %>%
pivot_wider(names_from = Trtmt_Day,
values_from = c(est, loCI, hiCI, SE, groups))
# columns that indicate if resistance is significantly different from recovery
decoupling_productivity$early_recov_VS_resist <- mapply(are_groups_different,
decoupling_productivity$groups_early_recov,
decoupling_productivity$groups_resist)
decoupling_productivity$late_recov_VS_resist <- mapply(are_groups_different,
decoupling_productivity$groups_late_recov,
decoupling_productivity$groups_resist)
# clean up extra columns
decoupling_productivity <- decoupling_productivity %>% select(-groups_resist, -groups_early_recov, -groups_late_recov)
# first plot the decoupling on early recovery
ggplot(decoupling_productivity,
aes(x = est_resist, y = est_early_recov, colour = as.factor(Heat))) +
facet_grid(~CommRich) +
geom_hline(yintercept = 0, colour="grey") +
geom_vline(xintercept = 0, colour="grey") +
geom_abline(slope = 1) +
geom_point(shape=21, size=3, aes(fill=as.factor(early_recov_VS_resist))) +
geom_errorbarh(aes(xmin = loCI_resist, xmax = hiCI_resist), height=0) +
geom_errorbar(aes(ymin = loCI_early_recov, ymax = hiCI_early_recov), width=0) +
# in case we want to center the plot on 0,0
#scale_x_continuous(limits = c(-0.01, 0.01)) +
#scale_y_continuous(limits = c(-0.004, 0.004))
scale_colour_viridis_d(option = "plasma", begin=0.2, end = 0.9) +
scale_fill_manual(values=c("white", "black")) +
labs(title = "Decoupling of productivity (with extinct reps)",
x = "Resistance +/- 95% CI",
y = "Early Recovery +/- 95% CI",
colour = "Heat\nDuration",
fill="Resistance\nvs. Recovery\nSignificantly\nDifferent?")
# here's another way to plot it, with the confidence intervals shown as ellipses (geom_ellipse() comes from the ggforce package):
ggplot(decoupling_productivity,
aes(x = est_resist, y = est_early_recov, colour = as.factor(Heat))) +
facet_grid(~CommRich) +
geom_hline(yintercept = 0, colour="grey") +
geom_vline(xintercept = 0, colour="grey") +
geom_abline(slope = 1) +
geom_point(shape=21, size=3, aes(fill=as.factor(early_recov_VS_resist))) +
scale_colour_viridis_d(option = "plasma", begin=0.2, end = 0.9) +
scale_fill_manual(values=c("white", "black")) +
geom_ellipse(aes(x0 = est_resist,
y0 = est_early_recov,
# radius on x direction:
a = hiCI_resist - est_resist,
# radius on y direction:
b = hiCI_early_recov - est_early_recov,
angle = 0)) +
labs(title = "Decoupling of productivity (with extinct reps)",
x = "Resistance +/- 95% CI",
y = "Early Recovery +/- 95% CI",
colour = "Heat\nDuration",
fill="Resistance\nvs. Recovery\nSignificantly\nDifferent?")
# next plot the decoupling on later recovery
ggplot(decoupling_productivity,
aes(x = est_resist, y = est_late_recov, colour = as.factor(Heat))) +
facet_grid(~CommRich) +
geom_hline(yintercept = 0, colour="grey") +
geom_vline(xintercept = 0, colour="grey") +
geom_abline(slope = 1) +
geom_point(shape=21, size=3, aes(fill=as.factor(late_recov_VS_resist))) +
geom_errorbarh(aes(xmin = loCI_resist, xmax = hiCI_resist), height=0) +
geom_errorbar(aes(ymin = loCI_late_recov, ymax = hiCI_late_recov), width=0) +
scale_colour_viridis_d(option = "plasma", begin=0.2, end = 0.9) +
scale_fill_manual(values=c("white", "black")) +
labs(title = "Decoupling of productivity (with extinct reps)",
x = "Resistance +/- 95% CI",
y = "Late Recovery +/- 95% CI",
colour = "Heat\nDuration",
fill="Resistance\nvs. Recovery\nSignificantly\nDifferent?")
# finally estimate decoupling by getting the distance to the y=x line
# calculate decoupling between resistance and early recovery
early_decoupling <- t(with(decoupling_productivity,
mapply(estimate_decoupling,
resist_est = est_resist,
resist_hiCI = hiCI_resist,
recov_est = est_early_recov,
recov_hiCI = hiCI_early_recov)))
# add annotation
early_decoupling <- cbind(decoupling_productivity[,1:2],
early_decoupling)
ggplot(early_decoupling,
aes(x = as.factor(Heat), y = est_decoupling, colour = as.factor(CommRich))) +
geom_hline(yintercept = 0, colour = "grey") +
geom_point(position = position_dodge(width = 0.5)) +
geom_errorbar(position = position_dodge(width = 0.5),
aes(ymin = loCI_decoupling, ymax = hiCI_decoupling),
alpha=0.4, width=0.1) +
scale_colour_viridis_d(option = "viridis", end=0.85) +
labs(title = "Early recovery (WITH extinct reps)",
colour = "Inoculated\nCommunity\nRichness",
y = "Decoupling +/- 95% CI",
x = "Heat Duration (hrs)")
# calculate decoupling between resistance and late recovery
late_decoupling <- t(with(decoupling_productivity,
mapply(estimate_decoupling,
resist_est = est_resist,
resist_hiCI = hiCI_resist,
recov_est = est_late_recov,
recov_hiCI = hiCI_late_recov)))
# add annotation
late_decoupling <- cbind(decoupling_productivity[,1:2],
late_decoupling)
ggplot(late_decoupling,
aes(x = as.factor(Heat), y = est_decoupling, colour = as.factor(CommRich))) +
geom_hline(yintercept = 0, colour = "grey") +
geom_point(position = position_dodge(width = 0.5)) +
geom_errorbar(position = position_dodge(width = 0.5),
aes(ymin = loCI_decoupling, ymax = hiCI_decoupling),
alpha=0.4, width=0.1) +
scale_colour_viridis_d(option = "viridis", end=0.85) +
labs(title = "Late recovery (WITH extinct reps)",
colour = "Inoculated\nCommunity\nRichness",
y = "Decoupling +/- 95% CI",
x = "Heat Duration (hrs)")
rm(productivity_effects, decoupling_productivity, early_decoupling, late_decoupling)
Let’s try the analysis again, but this time altogether dropping from the analysis any communities where all replicates went extinct at 48h (i.e., we drop the slow-growing monocultures, 0_0_0_1 & 0_0_1_0, and the pair of slow growers, 0_0_1_1, from all heat treatments). We will also drop all time points for any replicates that went extinct.
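The dropping step described above can be sketched in base R (toy data; the community coding follows the text above):

```r
# communities where all replicates went extinct at 48h (from the text above)
drop_comms <- c("0_0_0_1", "0_0_1_0", "0_0_1_1")
# toy stand-in for the real absDensity table
toy <- data.frame(community = c("0_0_0_1", "0_0_1_1", "1_1_0_0"), reps = c(5, 5, 5))
# keep only communities not on the drop list
kept <- subset(toy, !community %in% drop_comms)
```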
# for 24h: summarize the total number of reps and the fraction missing during resistance & recovery
absDensity %>%
  filter(Day %in% 1:3, Heat == 24,
         !is.na(Total_density), !is.na(CommRich), Total_density > 0) %>%
  group_by(community, CommRich, Day) %>%
  summarise(reps = n()) %>%
  pivot_wider(names_from = Day, values_from = reps, names_prefix = "reps_") %>%
  mutate(resist_num_n = ifelse(is.na(reps_2), 0, reps_2),
         recov_num_n = ifelse(is.na(reps_3), 0, reps_3),
         .keep = "unused") %>%
  mutate(resist_24h_frac_reps = resist_num_n / reps_1,
         recov_24h_frac_reps = recov_num_n / reps_1) %>%
  select(-resist_num_n, -recov_num_n) %>%
  rename(num_reps = reps_1) %>%
  arrange(CommRich)
## `summarise()` has grouped output by 'community', 'CommRich'. You can override
## using the `.groups` argument.
# for 48h: summarize the total number of reps and the fraction missing during resistance & recovery
absDensity %>%
  filter(Day %in% c(1, 3:4), Heat == 48,
         !is.na(Total_density), !is.na(CommRich), Total_density > 0) %>%
  group_by(community, CommRich, Day) %>%
  summarise(reps = n()) %>%
  pivot_wider(names_from = Day, values_from = reps, names_prefix = "reps_") %>%
  mutate(resist_num_n = ifelse(is.na(reps_3), 0, reps_3),
         recov_num_n = ifelse(is.na(reps_4), 0, reps_4),
         .keep = "unused") %>%
  mutate(resist_48h_frac_reps = resist_num_n / reps_1,
         recov_48h_frac_reps = recov_num_n / reps_1) %>%
  select(-resist_num_n, -recov_num_n) %>%
  rename(num_reps = reps_1) %>%
  arrange(CommRich)
## `summarise()` has grouped output by 'community', 'CommRich'. You can override
## using the `.groups` argument.
# remove the data with extinction
survived_uniqIDs <- extinct.df$uniqID[!is.na(extinct.df$survived) & extinct.df$survived == 1]
####################
# 6h heat duration EXCLUDING EXTINCT
####################
# exclude any replicates that experienced extinction
absDen_6h <- absDen_6h %>% filter(uniqID %in% survived_uniqIDs)
productivity6h_H0 <- glmmTMB(as.integer(TotDensity_scale * 1000) ~ CommRich*Heat*Trtmt_Day,
data = absDen_6h,
family = genpois,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
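Throughout these fits the response is the rescaled density converted to integer pseudo-counts so that the genpois family applies; the transform is simply truncation after multiplying by 1000 (the helper name below is hypothetical, used only to illustrate):

```r
# convert rescaled density to integer pseudo-counts, matching
# as.integer(TotDensity_scale * 1000) in the model formulas above
to_counts <- function(dens_scaled) as.integer(dens_scaled * 1000)  # hypothetical helper name
to_counts(c(0.1234, 1.5))
```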
simulateResiduals(fittedModel = productivity6h_H0, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.904 0.936 0.872 0.292 0.912 0.812 0.14 0.676 0.964 0.068 0.848 0.94 0.392 0.828 0.924 0.172 0.08 0.696 0.3384275 0.436 ...
productivity6h_H1 <- glmmTMB(as.integer(TotDensity_scale * 1000) ~ CommRich*Heat*Trtmt_Day + putida*CommRich + putida*Heat,
data = absDen_6h,
family = genpois,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = productivity6h_H1, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.908 0.936 0.856 0.228 0.92 0.828 0.144 0.356 0.916 0.072 0.872 0.948 0.324 0.808 0.916 0.26 0.1184275 0.74 0.328 0.4246324 ...
productivity6h_H2 <- glmmTMB(as.integer(TotDensity_scale * 1000) ~ CommRich*Heat*Trtmt_Day + protegens*CommRich + protegens*Heat,
data = absDen_6h,
family = genpois,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = productivity6h_H2, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.708 0.936 0.78 0.632 0.92 0.716 0.516 0.384 0.98 0.356 0.844 0.976 0.636 0.6 0.86 0.464 0.2264275 0.288 0.548 0.604 ...
# check preferred models
anova(productivity6h_H0, productivity6h_H1)
anova(productivity6h_H0, productivity6h_H2)
AIC(productivity6h_H0, productivity6h_H1, productivity6h_H2) %>% arrange(AIC)
BIC(productivity6h_H0, productivity6h_H1, productivity6h_H2) %>% arrange(BIC)
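The AIC/BIC comparison pattern used above can be illustrated with a toy base-R example (lower is better; the informative model should win when the extra predictor matters):

```r
# toy illustration of information-criterion model comparison with nested lm fits
set.seed(1)
d <- data.frame(x = rnorm(50))
d$y <- 2 * d$x + rnorm(50)          # x is genuinely informative here
m0 <- lm(y ~ 1, data = d)           # intercept-only null
m1 <- lm(y ~ x, data = d)           # model with the predictor
AIC(m0, m1)                         # m1 has the lower AIC
```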
# H2 is the preferred model
summary(productivity6h_H2)
## Family: genpois ( log )
## Formula:
## as.integer(TotDensity_scale * 1000) ~ CommRich * Heat * Trtmt_Day +
## protegens * CommRich + protegens * Heat
## Data: absDen_6h
##
## AIC BIC logLik deviance df.resid
## 5330.4 5420.1 -2642.2 5284.4 342
##
##
## Dispersion parameter for genpois family (): 387
##
## Conditional model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 7.429e+00 9.955e-02 74.63 <2e-16
## CommRich2 -2.879e-02 1.363e-01 -0.21 0.8327
## CommRich3 -5.411e-02 1.632e-01 -0.33 0.7403
## Heatcontrol -2.920e-01 1.748e-01 -1.67 0.0948
## Trtmt_Dayrecov_2 -3.034e-02 1.394e-01 -0.22 0.8277
## Trtmt_Dayresist 5.534e-02 1.371e-01 0.40 0.6866
## protegens -1.285e+00 1.312e-01 -9.79 <2e-16
## CommRich2:Heatcontrol 1.622e-01 2.229e-01 0.73 0.4667
## CommRich3:Heatcontrol 3.847e-01 2.590e-01 1.48 0.1375
## CommRich2:Trtmt_Dayrecov_2 -8.511e-02 1.898e-01 -0.45 0.6539
## CommRich3:Trtmt_Dayrecov_2 -2.132e-01 2.209e-01 -0.97 0.3345
## CommRich2:Trtmt_Dayresist 2.895e-02 1.861e-01 0.16 0.8764
## CommRich3:Trtmt_Dayresist -4.487e-02 2.151e-01 -0.21 0.8347
## Heatcontrol:Trtmt_Dayrecov_2 -1.716e-01 2.540e-01 -0.68 0.4994
## Heatcontrol:Trtmt_Dayresist 3.121e-02 2.440e-01 0.13 0.8982
## CommRich2:protegens -4.839e-02 1.498e-01 -0.32 0.7466
## CommRich3:protegens -8.439e-05 1.642e-01 0.00 0.9996
## Heatcontrol:protegens -4.988e-02 1.175e-01 -0.42 0.6713
## CommRich2:Heatcontrol:Trtmt_Dayrecov_2 8.645e-02 3.204e-01 0.27 0.7873
## CommRich3:Heatcontrol:Trtmt_Dayrecov_2 -1.871e-01 3.666e-01 -0.51 0.6097
## CommRich2:Heatcontrol:Trtmt_Dayresist 1.954e-01 3.074e-01 0.64 0.5251
## CommRich3:Heatcontrol:Trtmt_Dayresist 3.110e-01 3.447e-01 0.90 0.3670
##
## (Intercept) ***
## CommRich2
## CommRich3
## Heatcontrol .
## Trtmt_Dayrecov_2
## Trtmt_Dayresist
## protegens ***
## CommRich2:Heatcontrol
## CommRich3:Heatcontrol
## CommRich2:Trtmt_Dayrecov_2
## CommRich3:Trtmt_Dayrecov_2
## CommRich2:Trtmt_Dayresist
## CommRich3:Trtmt_Dayresist
## Heatcontrol:Trtmt_Dayrecov_2
## Heatcontrol:Trtmt_Dayresist
## CommRich2:protegens
## CommRich3:protegens
## Heatcontrol:protegens
## CommRich2:Heatcontrol:Trtmt_Dayrecov_2
## CommRich3:Heatcontrol:Trtmt_Dayrecov_2
## CommRich2:Heatcontrol:Trtmt_Dayresist
## CommRich3:Heatcontrol:Trtmt_Dayresist
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# create data.frame for plotting
prod_predict <- cbind(productivity6h_H2$frame,
predict(productivity6h_H2, type="response"))
colnames(prod_predict)[c(1,6)] <- c("observed", "predicted")
# remember to divide by 1000 to undo the scaling applied when transforming the data
prod_predict[,c(1,6)] <- prod_predict[,c(1,6)]/1000
# plot the model predictions against the data
ggplot(prod_predict,
aes(x=Trtmt_Day, y=observed, colour=CommRich)) +
facet_grid(protegens~Heat) +
geom_jitter(alpha=0.4) +
geom_line(aes(y=predicted, group=CommRich)) +
scale_colour_viridis_d(option = "viridis", begin=0.1, end=0.85) +
labs(y="Total density (rescaled)",
colour="CommRich")
# cleanup
rm(prod_predict)
####################
# 12h heat duration EXCLUDING EXTINCT
####################
# exclude any replicates that experienced extinction
absDen_12h <- absDen_12h %>% filter(uniqID %in% survived_uniqIDs)
productivity12h_H0 <- glmmTMB(as.integer(TotDensity_scale * 1000) ~ CommRich*Heat*Trtmt_Day,
data = absDen_12h,
family = genpois,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = productivity12h_H0, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.852 0.864 0.1684907 0.64 0.244 0.636 0.056 0.5 0.8580955 0.3452495 0.508 0.612 0.428 0.4589084 0.432 0.2569738 0.296 0.808 0.9 0.436 ...
productivity12h_H1 <- glmmTMB(as.integer(TotDensity_scale * 1000) ~ CommRich*Heat*Trtmt_Day + putida*CommRich + putida*Heat,
data = absDen_12h,
family = genpois,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = productivity12h_H1, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.808 0.804 0.08849074 0.648 0.244 0.6420955 0.056 0.34 0.816 0.428 0.58 0.704 0.412 0.432 0.5132495 0.264 0.148 0.764 0.872 0.548 ...
productivity12h_H2 <- glmmTMB(as.integer(TotDensity_scale * 1000) ~ CommRich*Heat*Trtmt_Day + protegens*CommRich + protegens*Heat,
data = absDen_12h,
family = genpois,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = productivity12h_H2, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.852 0.852 0.4124907 0.924 0.7100955 0.564 0.008 0.7892495 0.868 0.6229084 0.792 0.308 0.54 0.5724869 0.568 0.108 0.576 0.764 0.932 0.7300784 ...
# check preferred models
anova(productivity12h_H0, productivity12h_H1)
anova(productivity12h_H0, productivity12h_H2)
AIC(productivity12h_H0, productivity12h_H1, productivity12h_H2) %>% arrange(AIC)
BIC(productivity12h_H0, productivity12h_H1, productivity12h_H2) %>% arrange(BIC)
# H2 is the preferred model
summary(productivity12h_H2)
## Family: genpois ( log )
## Formula:
## as.integer(TotDensity_scale * 1000) ~ CommRich * Heat * Trtmt_Day +
## protegens * CommRich + protegens * Heat
## Data: absDen_12h
##
## AIC BIC logLik deviance df.resid
## 4605.7 4692.8 -2279.8 4559.7 304
##
##
## Dispersion parameter for genpois family (): 409
##
## Conditional model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 7.52603 0.14895 50.53 < 2e-16 ***
## CommRich2 -0.47396 0.18456 -2.57 0.010227 *
## CommRich3 -0.24722 0.21884 -1.13 0.258594
## Heatcontrol -0.50106 0.20912 -2.40 0.016573 *
## Trtmt_Dayrecov_2 -0.09367 0.18856 -0.50 0.619372
## Trtmt_Dayresist -1.32241 0.23603 -5.60 2.11e-08 ***
## protegens -1.29444 0.15647 -8.27 < 2e-16 ***
## CommRich2:Heatcontrol 0.54418 0.25387 2.14 0.032068 *
## CommRich3:Heatcontrol 0.30799 0.28898 1.07 0.286514
## CommRich2:Trtmt_Dayrecov_2 -0.31421 0.24475 -1.28 0.199204
## CommRich3:Trtmt_Dayrecov_2 -0.03415 0.26765 -0.13 0.898477
## CommRich2:Trtmt_Dayresist 0.93594 0.28388 3.30 0.000977 ***
## CommRich3:Trtmt_Dayresist 1.30899 0.29900 4.38 1.20e-05 ***
## Heatcontrol:Trtmt_Dayrecov_2 0.08100 0.29181 0.28 0.781336
## Heatcontrol:Trtmt_Dayresist 1.47197 0.31450 4.68 2.86e-06 ***
## CommRich2:protegens 0.16611 0.17315 0.96 0.337387
## CommRich3:protegens -0.12256 0.18662 -0.66 0.511351
## Heatcontrol:protegens -0.27796 0.13002 -2.14 0.032528 *
## CommRich2:Heatcontrol:Trtmt_Dayrecov_2 0.45167 0.36462 1.24 0.215441
## CommRich3:Heatcontrol:Trtmt_Dayrecov_2 0.35112 0.40948 0.86 0.391186
## CommRich2:Heatcontrol:Trtmt_Dayresist -0.88906 0.38138 -2.33 0.019743 *
## CommRich3:Heatcontrol:Trtmt_Dayresist -0.79365 0.41422 -1.92 0.055362 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# create data.frame for plotting
prod_predict <- cbind(productivity12h_H2$frame,
predict(productivity12h_H2, type="response"))
colnames(prod_predict)[c(1,6)] <- c("observed", "predicted")
# remember to divide by 1000 to undo the scaling applied when transforming the data
prod_predict[,c(1,6)] <- prod_predict[,c(1,6)]/1000
# plot the model predictions against the data
ggplot(prod_predict,
aes(x=Trtmt_Day, y=observed, colour=CommRich)) +
facet_grid(protegens~Heat) +
geom_jitter(alpha=0.4) +
geom_line(aes(y=predicted, group=CommRich)) +
scale_colour_viridis_d(option = "viridis", begin=0.1, end=0.85) +
labs(y="Total density (rescaled)",
colour="CommRich")
# cleanup
rm(prod_predict)
####################
# 24h heat duration EXCLUDING EXTINCT
####################
# exclude any replicates that experienced extinction
absDen_24h <- absDen_24h %>% filter(uniqID %in% survived_uniqIDs)
productivity24h_H0 <- glmmTMB(as.integer(TotDensity_scale * 1000) ~ CommRich*Heat*Trtmt_Day,
data = absDen_24h,
family = genpois,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = productivity24h_H0, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.2934722 0.1906685 0.704 0.2397486 0.876 0.3425418 0.7804869 0.668 0.1081567 0.08518922 0.8446709 0.8774346 0.1973584 0.4517649 0.424 0.02534851 0.612 0.5877744 0.1470979 0.2609041 ...
productivity24h_H1 <- glmmTMB(as.integer(TotDensity_scale * 1000) ~ CommRich*Heat*Trtmt_Day + putida*CommRich + putida*Heat,
data = absDen_24h,
family = genpois,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = productivity24h_H1, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.2544537 0.164573 0.7 0.5607468 0.948 0.2836335 0.916 0.6529738 0.09631347 0.07279281 0.876 0.92 0.3173418 0.4548691 0.424 0.02469801 0.616 0.5597649 0.1453485 0.2473231 ...
productivity24h_H2 <- glmmTMB(as.integer(TotDensity_scale * 1000) ~ CommRich*Heat*Trtmt_Day + protegens*CommRich + protegens*Heat,
data = absDen_24h,
family = genpois,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = productivity24h_H2, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.2744537 0.1746685 0.908 0.2357486 0.968 0.335267 0.944 0.94 0.08994766 0.0822351 0.964 0.968 0.1725946 0.808 0.7806709 0 0.832 0.864 0.1366073 0.2500188 ...
# check preferred models
anova(productivity24h_H0, productivity24h_H1)
anova(productivity24h_H0, productivity24h_H2)
AIC(productivity24h_H0, productivity24h_H1, productivity24h_H2) %>% arrange(AIC)
BIC(productivity24h_H0, productivity24h_H1, productivity24h_H2) %>% arrange(BIC)
# H2 is the preferred model
summary(productivity24h_H2)
## Family: genpois ( log )
## Formula:
## as.integer(TotDensity_scale * 1000) ~ CommRich * Heat * Trtmt_Day +
## protegens * CommRich + protegens * Heat
## Data: absDen_24h
##
## AIC BIC logLik deviance df.resid
## 4722.2 4809.4 -2338.1 4676.2 304
##
##
## Dispersion parameter for genpois family (): 1.02e+03
##
## Conditional model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 5.0865 0.2374 21.430 < 2e-16 ***
## CommRich2 2.3515 0.2636 8.922 < 2e-16 ***
## CommRich3 2.5883 0.3589 7.212 5.53e-13 ***
## Heatcontrol 2.0961 0.2966 7.067 1.59e-12 ***
## Trtmt_Dayrecov_2 2.4692 0.2985 8.271 < 2e-16 ***
## Trtmt_Dayresist -0.1921 0.3398 -0.565 0.571747
## protegens -0.9696 0.2187 -4.433 9.30e-06 ***
## CommRich2:Heatcontrol -2.1705 0.3457 -6.279 3.40e-10 ***
## CommRich3:Heatcontrol -2.5288 0.4055 -6.236 4.50e-10 ***
## CommRich2:Trtmt_Dayrecov_2 -2.3536 0.3311 -7.108 1.18e-12 ***
## CommRich3:Trtmt_Dayrecov_2 -2.6900 0.3729 -7.214 5.42e-13 ***
## CommRich2:Trtmt_Dayresist -2.1854 0.3995 -5.470 4.49e-08 ***
## CommRich3:Trtmt_Dayresist -1.7719 0.4811 -3.683 0.000231 ***
## Heatcontrol:Trtmt_Dayrecov_2 -2.4611 0.3944 -6.240 4.38e-10 ***
## Heatcontrol:Trtmt_Dayresist 0.3992 0.4209 0.948 0.342889
## CommRich2:protegens -0.1534 0.2090 -0.734 0.462940
## CommRich3:protegens -0.1578 0.2547 -0.620 0.535416
## Heatcontrol:protegens -0.2576 0.1875 -1.374 0.169315
## CommRich2:Heatcontrol:Trtmt_Dayrecov_2 2.4513 0.4551 5.386 7.20e-08 ***
## CommRich3:Heatcontrol:Trtmt_Dayrecov_2 2.9395 0.5129 5.731 9.99e-09 ***
## CommRich2:Heatcontrol:Trtmt_Dayresist 2.1527 0.4990 4.314 1.60e-05 ***
## CommRich3:Heatcontrol:Trtmt_Dayresist 2.1074 0.5921 3.559 0.000372 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# create data.frame for plotting
prod_predict <- cbind(productivity24h_H2$frame,
predict(productivity24h_H2, type="response"))
colnames(prod_predict)[c(1,6)] <- c("observed", "predicted")
# remember to divide by 1000 to undo the scaling applied when transforming the data
prod_predict[,c(1,6)] <- prod_predict[,c(1,6)]/1000
# plot the model predictions against the data
ggplot(prod_predict,
aes(x=Trtmt_Day, y=observed, colour=CommRich)) +
facet_grid(protegens~Heat) +
geom_jitter(alpha=0.4) +
geom_line(aes(y=predicted, group=CommRich)) +
scale_colour_viridis_d(option = "viridis", begin=0.1, end=0.85) +
labs(y="Total density (rescaled)",
colour="CommRich")
# cleanup
rm(prod_predict)
####################
# 48h heat duration EXCLUDING EXTINCT
####################
# exclude any replicates that experienced extinction
absDen_48h <- absDen_48h %>% filter(uniqID %in% survived_uniqIDs)
productivity48h_H0 <- glmmTMB(as.integer(TotDensity_scale * 1000) ~ CommRich*Heat*Trtmt_Day,
data = absDen_48h,
family = genpois,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = productivity48h_H0, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.6631446 0.6966356 0.7369068 0.6004906 0.6740638 0.768344 0.7105734 0.5695334 0.5345492 0.4647756 0.327021 0.3207532 0.3101096 0.09260498 0.4486883 0.2742965 0.3440096 0.7526236 0.892 0.876 ...
productivity48h_H1 <- glmmTMB(as.integer(TotDensity_scale * 1000) ~ CommRich*Heat*Trtmt_Day + putida*CommRich + putida*Heat,
data = absDen_48h,
family = genpois,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = productivity48h_H1, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.6100906 0.7984449 0.5938898 0.7161486 0.6207469 0.8043932 0.7515897 0.5559001 0.5083074 0.3282681 0.3957428 0.2993697 0.3547144 0.08929766 0.4378329 0.2912883 0.3900391 0.8091863 0.868 0.856 ...
productivity48h_H2 <- glmmTMB(as.integer(TotDensity_scale * 1000) ~ CommRich*Heat*Trtmt_Day + protegens*CommRich + protegens*Heat,
data = absDen_48h,
family = genpois,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
simulateResiduals(fittedModel = productivity48h_H2, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.7492526 0.8651589 0.5050593 0.7984163 0.7625356 0.8443932 0.7850816 0.5679001 0.5123074 0.2665147 0.436028 0.3699354 0.3738308 0.09128205 0.4486883 0.2694417 0.4215329 0.8434575 0.66 0.568 ...
# check preferred models
anova(productivity48h_H0, productivity48h_H1)
anova(productivity48h_H0, productivity48h_H2)
AIC(productivity48h_H0, productivity48h_H1, productivity48h_H2) %>% arrange(AIC)
BIC(productivity48h_H0, productivity48h_H1, productivity48h_H2) %>% arrange(BIC)
# H2 is the preferred model
summary(productivity48h_H2)
## Family: genpois ( log )
## Formula:
## as.integer(TotDensity_scale * 1000) ~ CommRich * Heat * Trtmt_Day +
## protegens * CommRich + protegens * Heat
## Data: absDen_48h
##
## AIC BIC logLik deviance df.resid
## 3285.9 3368.1 -1619.9 3239.9 241
##
##
## Dispersion parameter for genpois family (): 310
##
## Conditional model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 7.23596 0.17973 40.26 < 2e-16 ***
## CommRich2 -0.08220 0.21507 -0.38 0.7023
## CommRich3 -0.41712 0.28123 -1.48 0.1380
## Heatcontrol -0.31156 0.23166 -1.34 0.1787
## Trtmt_Dayrecov_2 -0.17670 0.23627 -0.75 0.4545
## Trtmt_Dayresist -4.11585 0.58290 -7.06 1.65e-12 ***
## protegens -1.46463 0.18350 -7.98 1.44e-15 ***
## CommRich2:Heatcontrol 0.30858 0.27281 1.13 0.2580
## CommRich3:Heatcontrol 0.54530 0.32744 1.67 0.0958 .
## CommRich2:Trtmt_Dayrecov_2 -0.15913 0.28871 -0.55 0.5815
## CommRich3:Trtmt_Dayrecov_2 0.09457 0.33801 0.28 0.7796
## CommRich2:Trtmt_Dayresist 0.13268 0.71780 0.18 0.8534
## CommRich3:Trtmt_Dayresist 1.04857 0.78656 1.33 0.1825
## Heatcontrol:Trtmt_Dayrecov_2 -0.55463 0.33947 -1.63 0.1023
## Heatcontrol:Trtmt_Dayresist 4.12936 0.62074 6.65 2.89e-11 ***
## CommRich2:protegens 0.10624 0.19063 0.56 0.5773
## CommRich3:protegens 0.20448 0.23323 0.88 0.3806
## Heatcontrol:protegens -0.03236 0.15942 -0.20 0.8392
## CommRich2:Heatcontrol:Trtmt_Dayrecov_2 0.05572 0.41024 0.14 0.8920
## CommRich3:Heatcontrol:Trtmt_Dayrecov_2 0.03780 0.47541 0.08 0.9366
## CommRich2:Heatcontrol:Trtmt_Dayresist -0.27815 0.76268 -0.36 0.7153
## CommRich3:Heatcontrol:Trtmt_Dayresist -1.34658 0.84235 -1.60 0.1099
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# create data.frame for plotting
prod_predict <- cbind(productivity48h_H2$frame,
predict(productivity48h_H2, type="response"))
colnames(prod_predict)[c(1,6)] <- c("observed", "predicted")
# remember to divide by 1000 to undo the scaling applied when transforming the data
prod_predict[,c(1,6)] <- prod_predict[,c(1,6)]/1000
# plot the model predictions against the data
ggplot(prod_predict,
aes(x=Trtmt_Day, y=observed, colour=CommRich)) +
facet_grid(protegens~Heat) +
geom_jitter(alpha=0.4) +
geom_line(aes(y=predicted, group=CommRich)) +
scale_colour_viridis_d(option = "viridis", begin=0.1, end=0.85) +
labs(y="Total density (rescaled)",
colour="CommRich")
# cleanup
rm(prod_predict)
# plot the effect size contingent on protegens
effect_6h_protegens <- eff_size(emmeans(productivity6h_H2, ~ Heat | CommRich*Trtmt_Day + protegens, data = absDen_6h),
sigma(productivity6h_H2),
edf = df.residual(productivity6h_H2))
effect_12h_protegens <- eff_size(emmeans(productivity12h_H2, ~ Heat | CommRich*Trtmt_Day + protegens, data = absDen_12h),
sigma(productivity12h_H2),
edf = df.residual(productivity12h_H2))
effect_24h_protegens <- eff_size(emmeans(productivity24h_H2, ~ Heat | CommRich*Trtmt_Day + protegens, data = absDen_24h),
sigma(productivity24h_H2),
edf = df.residual(productivity24h_H2))
effect_48h_protegens <- eff_size(emmeans(productivity48h_H2, ~ Heat | CommRich*Trtmt_Day + protegens, data = absDen_48h),
sigma(productivity48h_H2),
edf = df.residual(productivity48h_H2))
# a function that extracts the confidence intervals from eff_size contingent on protegens
get_effsize_CIs <- function(eff_size_object, heat_trtmt) {
data.frame(Heat = heat_trtmt,
CommRich = confint(eff_size_object)[[2]],
Trtmt_Day = confint(eff_size_object)[[3]],
protegens = confint(eff_size_object)[[4]],
effect_est = confint(eff_size_object)[[5]],
effect_loCI = confint(eff_size_object)[[8]],
effect_hiCI = confint(eff_size_object)[[9]])
}
# create a data.frame for plotting marginal effect sizes using a forest plot
productivity_protegens <- data.frame()
productivity_protegens <- rbind(productivity_protegens,
get_effsize_CIs(effect_6h_protegens, heat_trtmt = 6),
get_effsize_CIs(effect_12h_protegens, heat_trtmt = 12),
get_effsize_CIs(effect_24h_protegens, heat_trtmt = 24),
get_effsize_CIs(effect_48h_protegens, heat_trtmt = 48))
# re-order the levels of Trtmt_Day to go from resistance to recovery then rename them for nice plotting
productivity_protegens$Trtmt_Day <- factor(productivity_protegens$Trtmt_Day,
levels = c("recov_2", "recov_1", "resist"))
levels(productivity_protegens$Trtmt_Day) <- c("Recovery (Day 2)", "Recovery (Day 1)", "Resistance")
ggplot(productivity_protegens,
aes(x = effect_est, y = CommRich, colour = Trtmt_Day, shape = as.logical(protegens))) +
facet_grid(. ~ Heat) +
geom_vline(xintercept = 0, colour="darkgrey") +
geom_point(position = position_dodge(width = 0.5)) +
geom_errorbarh(position = position_dodge(width = 0.5),
aes(xmin = effect_loCI, xmax = effect_hiCI), height = 0.1) +
scale_colour_manual(values=trtmt_pal) +
labs(x = "Effect Size on Total Density",
y = "Inoculated Community Richness",
shape = "protegens\npresent?",
title = "(WITHOUT extinct reps)") +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))
# only 48h has any noticeable effect with protegens
# let's look at the post-hoc just for 48h with protegens
posthoc_48h_protegens <- emmeans(effect_48h_protegens,
pairwise ~ CommRich + Trtmt_Day + protegens,
data = absDen_48h)
print("Post-hoc for productivity at 48h (WITHOUT extinct replicates) conditional on protegens:")
## [1] "Post-hoc for productivity at 48h (WITHOUT extinct replicates) conditional on protegens:"
multcomp::cld(posthoc_48h_protegens, alpha=0.05/4, Letters = letters)
# okay, this makes sense: only 48h of heat is long enough to depress total density even after 2 days of recovery.
# In the presence of protegens, total density can bounce back; in its absence, total density fails to bounce back even after 2 days of recovery.
# But we are not interested in the details of protegens, so let's run the post-hoc again, this time averaging across the effects of protegens.
posthoc_6h <- emmeans(effect_6h_protegens,
pairwise ~ CommRich + Trtmt_Day,
data = absDen_6h)
## NOTE: Results may be misleading due to involvement in interactions
posthoc_12h <- emmeans(effect_12h_protegens,
pairwise ~ CommRich + Trtmt_Day,
data = absDen_12h)
## NOTE: Results may be misleading due to involvement in interactions
posthoc_24h <- emmeans(effect_24h_protegens,
pairwise ~ CommRich + Trtmt_Day,
data = absDen_24h)
## NOTE: Results may be misleading due to involvement in interactions
posthoc_48h <- emmeans(effect_48h_protegens,
pairwise ~ CommRich + Trtmt_Day,
data = absDen_48h)
## NOTE: Results may be misleading due to involvement in interactions
# create a data.frame for plotting marginal effect sizes using a forest plot with the group labels
productivity_effects <- data.frame()
productivity_effects <- rbind(productivity_effects,
get_posthoc(posthoc_6h, heat_trtmt = 6),
get_posthoc(posthoc_12h, heat_trtmt = 12),
get_posthoc(posthoc_24h, heat_trtmt = 24),
get_posthoc(posthoc_48h, heat_trtmt = 48))
# re-order the levels of Trtmt_Day to go from resistance to recovery then rename them for nice plotting
productivity_effects$Trtmt_Day <- factor(productivity_effects$Trtmt_Day,
levels = c("recov_2", "recov_1", "resist"))
levels(productivity_effects$Trtmt_Day) <- c("Recovery (Day 2)", "Recovery (Day 1)", "Resistance")
# plot
ggplot(productivity_effects,
aes(x = est, y = CommRich, colour = Trtmt_Day)) +
facet_grid(. ~ Heat) +
geom_vline(xintercept = 0, colour="darkgrey") +
geom_point(position = position_dodge(width = 0.5)) +
geom_errorbarh(position = position_dodge(width = 0.5),
aes(xmin = loCI, xmax = hiCI), height = 0.1) +
geom_text(position = position_dodge(width = 0.5),
aes(x=-0.015, label=groups)) +
scale_colour_manual(values=trtmt_pal) +
labs(x = "Effect Size on Total Density",
y = "Inoculated Community Richness",
title = "Averaged across protegens (WITHOUT extinct reps)") +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))
# finally, do the t-tests
# estimate the sample sizes
temp <- productivity_effects # copy the effects to temp
productivity_effects <- rbind(temp %>% filter(Heat == 6, CommRich == 1) %>% mutate(n = estimate_n(absDen_6h, CommRich=1)),
temp %>% filter(Heat == 6, CommRich == 2) %>% mutate(n = estimate_n(absDen_6h, CommRich=2)),
temp %>% filter(Heat == 6, CommRich == 3) %>% mutate(n = estimate_n(absDen_6h, CommRich=3)),
temp %>% filter(Heat == 12, CommRich == 1) %>% mutate(n = estimate_n(absDen_12h, CommRich=1)),
temp %>% filter(Heat == 12, CommRich == 2) %>% mutate(n = estimate_n(absDen_12h, CommRich=2)),
temp %>% filter(Heat == 12, CommRich == 3) %>% mutate(n = estimate_n(absDen_12h, CommRich=3)),
temp %>% filter(Heat == 24, CommRich == 1) %>% mutate(n = estimate_n(absDen_24h, CommRich=1)),
temp %>% filter(Heat == 24, CommRich == 2) %>% mutate(n = estimate_n(absDen_24h, CommRich=2)),
temp %>% filter(Heat == 24, CommRich == 3) %>% mutate(n = estimate_n(absDen_24h, CommRich=3)),
temp %>% filter(Heat == 48, CommRich == 1) %>% mutate(n = estimate_n(absDen_48h, CommRich=1)),
temp %>% filter(Heat == 48, CommRich == 2) %>% mutate(n = estimate_n(absDen_48h, CommRich=2)),
temp %>% filter(Heat == 48, CommRich == 3) %>% mutate(n = estimate_n(absDen_48h, CommRich=3)))
rm(temp)
# estimate the SD from the SE
productivity_effects <- productivity_effects %>% mutate(SD = SE * sqrt(n)) %>%
# re-order by Heat, Trtmt_Day, and CommRich
arrange(Heat, Trtmt_Day, CommRich)
# all pairwise comparisons between the same treatment day across different heat durations
# rows 1, 10, 19, 28 are the first rows of the 6h, 12h, 24h, and 48h blocks
# (each block holds 9 rows: 3 Trtmt_Day x 3 CommRich), so adding offsets 0-8
# repeats the cross-duration pairing for every Trtmt_Day x CommRich cell
temp <- t(combn(c(1,10,19,28), 2))
combos <- rbind(temp, temp+1, temp+2, temp+3, temp+4, temp+5, temp+6, temp+7, temp+8)
rm(temp)
# loop through all the combinations and do the t-tests
prodEffects_ttests <- data.frame()
for(i in 1:nrow(combos)){
prodEffects_ttests <- rbind(prodEffects_ttests,
run_ttest(row_x = combos[i,1],
row_y = combos[i,2],
summary_stats_df=productivity_effects))
}
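The `run_ttest()` helper is defined earlier in the document and operates on the summary statistics (estimate, SD, n) rather than on raw data. For reference, a minimal sketch of what such a test presumably computes — Welch's t-test from summary statistics, with `welch_from_summary` being a hypothetical name, not the actual helper:

```r
# Hedged sketch: Welch's t-test computed from summary statistics alone
welch_from_summary <- function(m1, s1, n1, m2, s2, n2) {
  se2_1 <- s1^2 / n1   # squared standard error of group 1
  se2_2 <- s2^2 / n2   # squared standard error of group 2
  t_stat <- (m1 - m2) / sqrt(se2_1 + se2_2)
  # Welch-Satterthwaite approximation for the degrees of freedom
  df <- (se2_1 + se2_2)^2 /
    (se2_1^2 / (n1 - 1) + se2_2^2 / (n2 - 1))
  data.frame(t_statistic = t_stat, df = df,
             pvalue = 2 * pt(-abs(t_stat), df))
}
# with equal SDs and equal n, df reduces to 2*(n-1):
# welch_from_summary(10, 2, 5, 8, 2, 5) gives df = 8
```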
prodEffects_ttests$adjusted_p <- p.adjust(prodEffects_ttests$pvalue, method = "bonferroni")
prodEffects_ttests$Trtmt_Day <- productivity_effects$Trtmt_Day[combos[,1]]
prodEffects_ttests$Heat_1 <- productivity_effects$Heat[combos[,1]]
prodEffects_ttests$Heat_2 <- productivity_effects$Heat[combos[,2]]
prodEffects_ttests$CommRich_1 <- productivity_effects$CommRich[combos[,1]]
prodEffects_ttests$CommRich_2 <- productivity_effects$CommRich[combos[,2]]
print(prodEffects_ttests)
## t_statistic df pvalue adjusted_p Trtmt_Day Heat_1
## t -0.2877406 3.988398 7.878773e-01 1.000000e+00 Recovery (Day 2) 6
## t1 3.2286617 5.907933 1.833544e-02 9.901138e-01 Recovery (Day 2) 6
## t2 -2.5934208 1.615191 1.507699e-01 1.000000e+00 Recovery (Day 2) 6
## t3 2.7378166 2.539888 8.593044e-02 1.000000e+00 Recovery (Day 2) 12
## t4 -2.2803855 1.957779 1.528798e-01 1.000000e+00 Recovery (Day 2) 12
## t5 -4.0551262 1.344916 1.034306e-01 1.000000e+00 Recovery (Day 2) 24
## t6 8.3047396 11.752375 2.944540e-06 1.590051e-04 Recovery (Day 2) 6
## t7 2.7178272 9.007959 2.367355e-02 1.000000e+00 Recovery (Day 2) 6
## t8 -3.6923493 6.739766 8.271451e-03 4.466584e-01 Recovery (Day 2) 6
## t9 -7.7527078 7.875430 5.957360e-05 3.216974e-03 Recovery (Day 2) 12
## t10 -9.3786639 7.206610 2.709884e-05 1.463337e-03 Recovery (Day 2) 12
## t11 -5.6547361 4.844433 2.659753e-03 1.436266e-01 Recovery (Day 2) 24
## t12 3.1026815 7.734006 1.522879e-02 8.223544e-01 Recovery (Day 2) 6
## t13 2.5786292 6.098306 4.124026e-02 1.000000e+00 Recovery (Day 2) 6
## t14 -0.4269523 3.418400 6.949022e-01 1.000000e+00 Recovery (Day 2) 6
## t15 -1.2424275 5.729563 2.625115e-01 1.000000e+00 Recovery (Day 2) 12
## t16 -2.4112246 3.440537 8.390105e-02 1.000000e+00 Recovery (Day 2) 12
## t17 -1.9006420 2.626553 1.664704e-01 1.000000e+00 Recovery (Day 2) 24
## t18 -2.0833373 4.013176 1.053951e-01 1.000000e+00 Recovery (Day 1) 6
## t19 10.7051125 6.930773 1.462788e-05 7.899058e-04 Recovery (Day 1) 6
## t20 -0.4490623 1.726106 7.033771e-01 1.000000e+00 Recovery (Day 1) 6
## t21 10.6118386 3.053372 1.653142e-03 8.926969e-02 Recovery (Day 1) 12
## t22 0.8952517 2.137784 4.598383e-01 1.000000e+00 Recovery (Day 1) 12
## t23 -5.8345923 1.490839 5.293039e-02 1.000000e+00 Recovery (Day 1) 24
## t24 0.8528785 11.857990 4.106231e-01 1.000000e+00 Recovery (Day 1) 6
## t25 1.3729323 8.950265 2.031879e-01 1.000000e+00 Recovery (Day 1) 6
## t26 1.3443797 7.379206 2.186848e-01 1.000000e+00 Recovery (Day 1) 6
## t27 0.2327374 7.973137 8.218285e-01 1.000000e+00 Recovery (Day 1) 12
## t28 0.6747030 7.688193 5.196334e-01 1.000000e+00 Recovery (Day 1) 12
## t29 0.6140034 4.997247 5.660851e-01 1.000000e+00 Recovery (Day 1) 24
## t30 -3.1969557 7.640380 1.346082e-02 7.268845e-01 Recovery (Day 1) 6
## t31 -2.9880255 6.178894 2.353112e-02 1.000000e+00 Recovery (Day 1) 6
## t32 1.1364521 3.415337 3.291686e-01 1.000000e+00 Recovery (Day 1) 6
## t33 1.0187419 5.606887 3.502524e-01 1.000000e+00 Recovery (Day 1) 12
## t34 3.2056389 3.620514 3.769699e-02 1.000000e+00 Recovery (Day 1) 12
## t35 2.8965007 2.648510 7.281726e-02 1.000000e+00 Recovery (Day 1) 24
## t36 6.9210843 3.441254 3.910995e-03 2.111937e-01 Resistance 6
## t37 11.7587520 7.113769 6.429325e-06 3.471836e-04 Resistance 6
## t38 9.8874550 1.312502 3.423024e-02 1.000000e+00 Resistance 6
## t39 0.7307737 2.870966 5.199872e-01 1.000000e+00 Resistance 12
## t40 7.6090488 1.428036 3.976521e-02 1.000000e+00 Resistance 12
## t41 7.6014189 1.288077 5.014566e-02 1.000000e+00 Resistance 24
## t42 5.0893220 11.322217 3.196226e-04 1.725962e-02 Resistance 6
## t43 13.7739409 11.193197 2.290040e-08 1.236622e-06 Resistance 6
## t44 18.6512707 4.574741 1.716077e-05 9.266818e-04 Resistance 6
## t45 6.2306147 9.258479 1.357538e-04 7.330708e-03 Resistance 12
## t46 16.9515317 4.739925 2.005211e-05 1.082814e-03 Resistance 12
## t47 15.5573650 4.414052 5.039663e-05 2.721418e-03 Resistance 24
## t48 -0.7525000 7.669884 4.742090e-01 1.000000e+00 Resistance 6
## t49 3.3977405 6.749968 1.213097e-02 6.550721e-01 Resistance 6
## t50 8.6764509 2.424135 6.989952e-03 3.774574e-01 Resistance 6
## t51 4.0693815 6.439359 5.669303e-03 3.061424e-01 Resistance 12
## t52 8.8640622 2.450323 6.400986e-03 3.456533e-01 Resistance 12
## t53 7.8720111 2.369567 9.453640e-03 5.104966e-01 Resistance 24
## Heat_2 CommRich_1 CommRich_2
## t 12 1 1
## t1 24 1 1
## t2 48 1 1
## t3 24 1 1
## t4 48 1 1
## t5 48 1 1
## t6 12 2 2
## t7 24 2 2
## t8 48 2 2
## t9 24 2 2
## t10 48 2 2
## t11 48 2 2
## t12 12 3 3
## t13 24 3 3
## t14 48 3 3
## t15 24 3 3
## t16 48 3 3
## t17 48 3 3
## t18 12 1 1
## t19 24 1 1
## t20 48 1 1
## t21 24 1 1
## t22 48 1 1
## t23 48 1 1
## t24 12 2 2
## t25 24 2 2
## t26 48 2 2
## t27 24 2 2
## t28 48 2 2
## t29 48 2 2
## t30 12 3 3
## t31 24 3 3
## t32 48 3 3
## t33 24 3 3
## t34 48 3 3
## t35 48 3 3
## t36 12 1 1
## t37 24 1 1
## t38 48 1 1
## t39 24 1 1
## t40 48 1 1
## t41 48 1 1
## t42 12 2 2
## t43 24 2 2
## t44 48 2 2
## t45 24 2 2
## t46 48 2 2
## t47 48 2 2
## t48 12 3 3
## t49 24 3 3
## t50 48 3 3
## t51 24 3 3
## t52 48 3 3
## t53 48 3 3
# plot again without the group labels
ggplot(productivity_effects,
aes(x = est, y = CommRich, colour = Trtmt_Day)) +
facet_grid(. ~ Heat) +
geom_vline(xintercept = 0, colour="darkgrey") +
geom_point(position = position_dodge(width = 0.5)) +
geom_errorbarh(position = position_dodge(width = 0.5),
aes(xmin = loCI, xmax = hiCI), height = 0.1) +
scale_colour_manual(values=trtmt_pal) +
labs(x = "Effect Size on Total Density",
y = "Inoculated Community Richness",
title = "Averaged across protegens (WITHOUT extinct reps)") +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))
rm(productivity6h_H0, productivity6h_H1, productivity6h_H2,
productivity12h_H0, productivity12h_H1, productivity12h_H2,
productivity24h_H0, productivity24h_H1, productivity24h_H2,
productivity48h_H0, productivity48h_H1, productivity48h_H2,
effect_6h_protegens, effect_12h_protegens, effect_24h_protegens, effect_48h_protegens,
posthoc_6h, posthoc_12h, posthoc_24h, posthoc_48h, posthoc_48h_protegens,
productivity_protegens, prodEffects_ttests,
absDen_6h, absDen_12h, absDen_24h, absDen_48h, combos)
Let’s look at decoupling:
# rename the levels of Trtmt_Day
decoupling_productivity <- productivity_effects
levels(decoupling_productivity$Trtmt_Day) <- c("late_recov", "early_recov", "resist")
# create data.frame for plotting
decoupling_productivity <- decoupling_productivity %>%
select(-n, -SD) %>%
pivot_wider(names_from = Trtmt_Day,
values_from = c(est, loCI, hiCI, SE, groups))
# columns that indicate if resistance is significantly different from recovery
decoupling_productivity$early_recov_VS_resist <- mapply(are_groups_different,
decoupling_productivity$groups_early_recov,
decoupling_productivity$groups_resist)
decoupling_productivity$late_recov_VS_resist <- mapply(are_groups_different,
decoupling_productivity$groups_late_recov,
decoupling_productivity$groups_resist)
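The `are_groups_different()` helper is defined earlier in the document; the sketch below (under a hypothetical name) shows the convention it presumably follows — in a compact letter display, two estimates differ significantly exactly when their letter groups share no letter:

```r
# Hedged sketch of a compact-letter-display comparison
are_groups_different_sketch <- function(g1, g2) {
  !any(strsplit(g1, "")[[1]] %in% strsplit(g2, "")[[1]])
}
# are_groups_different_sketch("ab", "c")  # TRUE: no shared letter
# are_groups_different_sketch("ab", "b")  # FALSE: "b" is shared
```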
# clean up extra columns
decoupling_productivity <- decoupling_productivity %>% select(-groups_resist, -groups_early_recov, -groups_late_recov)
# first plot the decoupling on early recovery
ggplot(decoupling_productivity,
aes(x = est_resist, y = est_early_recov, colour = as.factor(Heat))) +
facet_grid(~CommRich) +
geom_hline(yintercept = 0, colour="grey") +
geom_vline(xintercept = 0, colour="grey") +
geom_abline(slope = 1) +
geom_point(shape=21, size=3, aes(fill=as.factor(early_recov_VS_resist))) +
geom_errorbarh(aes(xmin = loCI_resist, xmax = hiCI_resist), height=0) +
geom_errorbar(aes(ymin = loCI_early_recov, ymax = hiCI_early_recov), width=0) +
scale_colour_viridis_d(option = "plasma", begin=0.2, end = 0.9) +
scale_fill_manual(values=c("white", "black")) +
labs(title = "Decoupling of productivity (WITHOUT extinct)",
x = "Resistance +/- 95% CI",
y = "Early Recovery +/- 95% CI",
colour = "Heat\nDuration",
fill="Resistance\nvs. Recovery\nSignificantly\nDifferent?")
# here's another way to plot it where the confidence intervals are shown as ellipses:
ggplot(decoupling_productivity,
aes(x = est_resist, y = est_early_recov, colour = as.factor(Heat))) +
facet_grid(~CommRich) +
geom_hline(yintercept = 0, colour="grey") +
geom_vline(xintercept = 0, colour="grey") +
geom_abline(slope = 1) +
geom_point(shape=21, size=3, aes(fill=as.factor(early_recov_VS_resist))) +
scale_colour_viridis_d(option = "plasma", begin=0.2, end = 0.9) +
scale_fill_manual(values=c("white", "black")) +
geom_ellipse(aes(x0 = est_resist,
y0 = est_early_recov,
# radius on x direction:
a = hiCI_resist - est_resist,
# radius on y direction:
b = hiCI_early_recov - est_early_recov,
angle = 0)) +
labs(title = "Decoupling of productivity (WITHOUT extinct)",
x = "Resistance +/- 95% CI",
y = "Early Recovery +/- 95% CI",
colour = "Heat\nDuration",
fill="Resistance\nvs. Recovery\nSignificantly\nDifferent?")
# next plot the decoupling on later recovery
ggplot(decoupling_productivity,
aes(x = est_resist, y = est_late_recov, colour = as.factor(Heat))) +
facet_grid(~CommRich) +
geom_hline(yintercept = 0, colour="grey") +
geom_vline(xintercept = 0, colour="grey") +
geom_abline(slope = 1) +
geom_point(shape=21, size=3, aes(fill=as.factor(late_recov_VS_resist))) +
geom_errorbarh(aes(xmin = loCI_resist, xmax = hiCI_resist), height=0) +
geom_errorbar(aes(ymin = loCI_late_recov, ymax = hiCI_late_recov), width=0) +
scale_colour_viridis_d(option = "plasma", begin=0.2, end = 0.9) +
scale_fill_manual(values=c("white", "black")) +
labs(title = "Decoupling of productivity (WITHOUT extinct)",
x = "Resistance +/- 95% CI",
y = "Late Recovery +/- 95% CI",
colour = "Heat\nDuration",
fill="Resistance\nvs. Recovery\nSignificantly\nDifferent?")
# calculate decoupling between resistance and early recovery
NoExtinct_early_decoupling <- t(with(decoupling_productivity,
mapply(estimate_decoupling,
resist_est = est_resist,
resist_hiCI = hiCI_resist,
recov_est = est_early_recov,
recov_hiCI = hiCI_early_recov)))
# add annotation
NoExtinct_early_decoupling <- cbind(decoupling_productivity[,1:2],
NoExtinct_early_decoupling)
ggplot(NoExtinct_early_decoupling,
aes(x = as.factor(Heat), y = est_decoupling, colour = as.factor(CommRich))) +
geom_hline(yintercept = 0, colour = "grey") +
geom_point(position = position_dodge(width = 0.5)) +
geom_errorbar(position = position_dodge(width = 0.5),
aes(ymin = loCI_decoupling, ymax = hiCI_decoupling),
alpha=0.4, width=0.1) +
scale_colour_viridis_d(option = "viridis", end=0.85) +
labs(title = "Early recovery (WITHOUT extinct reps)",
colour = "Inoculated\nCommunity\nRichness",
y = "Decoupling +/- 95% CI",
x = "Heat Duration (hrs)")
# calculate decoupling between resistance and late recovery
NoExtinct_late_decoupling <- t(with(decoupling_productivity,
mapply(estimate_decoupling,
resist_est = est_resist,
resist_hiCI = hiCI_resist,
recov_est = est_late_recov,
recov_hiCI = hiCI_late_recov)))
# add annotation
NoExtinct_late_decoupling <- cbind(decoupling_productivity[,1:2],
NoExtinct_late_decoupling)
ggplot(NoExtinct_late_decoupling,
aes(x = as.factor(Heat), y = est_decoupling, colour = as.factor(CommRich))) +
geom_hline(yintercept = 0, colour = "grey") +
geom_point(position = position_dodge(width = 0.5)) +
geom_errorbar(position = position_dodge(width = 0.5),
aes(ymin = loCI_decoupling, ymax = hiCI_decoupling),
alpha=0.4, width=0.1) +
scale_colour_viridis_d(option = "viridis", end=0.85) +
labs(title = "Late recovery (WITHOUT extinct reps)",
colour = "Inoculated\nCommunity\nRichness",
y = "Decoupling +/- 95% CI",
x = "Heat Duration (hrs)")
rm(productivity_effects, decoupling_productivity)
This is just for didactic reasons to show that extinct replicates have perfect coupling:
# recall the data just for the extinction prone communities
absDen_48h <- productivitySubsettedData$h48 %>% filter(community %in% extict_prone_comms)
productivity48h_H0 <- glmmTMB(as.integer(TotDensity_scale * 1000) ~ CommRich*Heat*Trtmt_Day,
data = absDen_48h,
family = genpois,
control = glmmTMBControl(optCtrl = list(iter.max = 500000,eval.max = 500000)))
# create data.frame for plotting prediction
prod_predict <- cbind(productivity48h_H0$frame,
predict(productivity48h_H0, type="response"))
colnames(prod_predict)[c(1,5)] <- c("observed", "predicted")
# remember to divide by 1000 to undo the scaling applied before fitting
prod_predict[,c(1,5)] <- prod_predict[,c(1,5)]/1000
# plot the model predictions against the data
ggplot(prod_predict,
aes(x=Trtmt_Day, y=observed, colour=CommRich)) +
facet_grid(~Heat) +
geom_jitter(alpha=0.4) +
geom_line(aes(y=predicted, group=CommRich)) +
scale_colour_viridis_d(option = "viridis", begin=0.1, end=0.85) +
labs(y="Total density (rescaled)",
colour="CommRich")
# cleanup
rm(prod_predict)
# get the effect sizes
effect_48h <- eff_size(emmeans(productivity48h_H0, ~ Heat | CommRich*Trtmt_Day, data = absDen_48h),
sigma(productivity48h_H0),
edf = df.residual(productivity48h_H0))
# do the post hoc
posthoc_48h <- emmeans(effect_48h,
pairwise ~ CommRich + Trtmt_Day,
data = absDen_48h)
## NOTE: Results may be misleading due to involvement in interactions
# create a data.frame of effect sizes for plotting as a forest plot with group labels
productivity_effects <- data.frame()
productivity_effects <- rbind(productivity_effects,
get_posthoc(posthoc_48h, heat_trtmt = 48))
# re-order the levels of Trtmt_Day to go from resistance to recovery then rename them for nice plotting
productivity_effects$Trtmt_Day <- factor(productivity_effects$Trtmt_Day,
levels = c("recov_2", "recov_1", "resist"))
levels(productivity_effects$Trtmt_Day) <- c("Recovery (Day 2)", "Recovery (Day 1)", "Resistance")
ggplot(productivity_effects,
aes(x = est, y = CommRich, colour = Trtmt_Day)) +
facet_grid(. ~ Heat) +
geom_vline(xintercept = 0, colour="darkgrey") +
geom_point(position = position_dodge(width = 0.5)) +
geom_errorbarh(position = position_dodge(width = 0.5),
aes(xmin = loCI, xmax = hiCI), height = 0.1) +
scale_colour_manual(values=trtmt_pal) +
scale_x_continuous(limits = c(-0.015, 0.005)) +
labs(x = "Effect Size on Total Density",
y = "Inoculated Community Richness",
title = "(only extinction-prone communities)") +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_errorbarh()`).
ggplot(productivity_effects,
aes(x = est, y = CommRich, colour = Trtmt_Day)) +
facet_grid(. ~ Heat) +
geom_vline(xintercept = 0, colour="darkgrey") +
geom_point(position = position_dodge(width = 0.5)) +
geom_errorbarh(position = position_dodge(width = 0.5),
aes(xmin = loCI, xmax = hiCI), height = 0.1) +
scale_colour_manual(values=trtmt_pal) +
labs(x = "Effect Size on Total Density",
y = "Inoculated Community Richness",
title = "(only extinction-prone communities)") +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))
# rename the levels of Trtmt_Day
decoupling_productivity <- productivity_effects
levels(decoupling_productivity$Trtmt_Day) <- c("late_recov", "early_recov", "resist")
# create data.frame for plotting
decoupling_productivity <- decoupling_productivity %>%
pivot_wider(names_from = Trtmt_Day,
values_from = c(est, loCI, hiCI, SE, groups))
# columns that indicate if resistance is significantly different from recovery
decoupling_productivity$early_recov_VS_resist <- mapply(are_groups_different,
decoupling_productivity$groups_early_recov,
decoupling_productivity$groups_resist)
decoupling_productivity$late_recov_VS_resist <- mapply(are_groups_different,
decoupling_productivity$groups_late_recov,
decoupling_productivity$groups_resist)
# clean up extra columns
decoupling_productivity <- decoupling_productivity %>% select(-groups_resist, -groups_early_recov, -groups_late_recov)
# first plot the decoupling on early recovery
ggplot(decoupling_productivity,
aes(x = est_resist, y = est_early_recov, colour = as.factor(Heat))) +
facet_grid(~CommRich) +
geom_hline(yintercept = 0, colour="grey") +
geom_vline(xintercept = 0, colour="grey") +
geom_abline(slope = 1) +
geom_point(shape=21, size=3, aes(fill=as.factor(early_recov_VS_resist))) +
geom_errorbarh(aes(xmin = loCI_resist, xmax = hiCI_resist), height=0) +
scale_x_continuous(limits = c(-0.015, 0.005)) +
geom_errorbar(aes(ymin = loCI_early_recov, ymax = hiCI_early_recov), width=0) +
scale_colour_viridis_d(option = "plasma", begin=0.2, end = 0.9) +
scale_fill_manual(values=c("white", "black")) +
labs(title = "(only extinction-prone communities)",
x = "Resistance +/- 95% CI",
y = "Early Recovery +/- 95% CI",
colour = "Heat\nDuration",
fill="Resistance\nvs. Recovery\nSignificantly\nDifferent?")
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_errorbarh()`).
# next plot the decoupling on later recovery
ggplot(decoupling_productivity,
aes(x = est_resist, y = est_late_recov, colour = as.factor(Heat))) +
facet_grid(~CommRich) +
geom_hline(yintercept = 0, colour="grey") +
geom_vline(xintercept = 0, colour="grey") +
geom_abline(slope = 1) +
geom_point(shape=21, size=3, aes(fill=as.factor(late_recov_VS_resist))) +
geom_errorbarh(aes(xmin = loCI_resist, xmax = hiCI_resist), height=0) +
scale_x_continuous(limits = c(-0.015, 0.005)) +
geom_errorbar(aes(ymin = loCI_late_recov, ymax = hiCI_late_recov), width=0) +
scale_colour_viridis_d(option = "plasma", begin=0.2, end = 0.9) +
scale_fill_manual(values=c("white", "black")) +
labs(title = "(only extinction-prone communities)",
x = "Resistance +/- 95% CI",
y = "Late Recovery +/- 95% CI",
colour = "Heat\nDuration",
fill="Resistance\nvs. Recovery\nSignificantly\nDifferent?")
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_errorbarh()`).
# calculate decoupling between resistance and early recovery
early_decoupling <- t(with(decoupling_productivity,
mapply(estimate_decoupling,
resist_est = est_resist,
resist_hiCI = hiCI_resist,
recov_est = est_early_recov,
recov_hiCI = hiCI_early_recov)))
# add annotation
early_decoupling <- cbind(decoupling_productivity[,1:2],
early_decoupling)
ggplot(rbind(NoExtinct_early_decoupling %>% filter(Heat == 48) %>%
mutate(data = "no extinctions"),
early_decoupling %>% mutate(data = "extinction-prone")),
aes(x = as.factor(Heat),
y = est_decoupling,
colour = as.factor(CommRich),
shape = data)) +
geom_hline(yintercept = 0, colour = "grey") +
geom_point(position = position_dodge(width = 0.5)) +
geom_errorbar(position = position_dodge(width = 0.5),
aes(ymin = loCI_decoupling, ymax = hiCI_decoupling),
alpha=0.4, width=0.1) +
scale_colour_viridis_d(option = "viridis", end=0.85) +
scale_y_continuous(limits = c(-0.005, 0.015))+
labs(title = "effect of extinction on early recovery",
colour = "Inoculated\nCommunity\nRichness",
y = "Decoupling +/- 95% CI",
x = "Heat Duration (hrs)",
shape = "communities")
# calculate decoupling between resistance and late recovery
late_decoupling <- t(with(decoupling_productivity,
mapply(estimate_decoupling,
resist_est = est_resist,
resist_hiCI = hiCI_resist,
recov_est = est_late_recov,
recov_hiCI = hiCI_late_recov)))
# add annotation
late_decoupling <- cbind(decoupling_productivity[,1:2],
late_decoupling)
ggplot(rbind(NoExtinct_late_decoupling %>% filter(Heat == 48) %>%
mutate(data = "no extinctions"),
late_decoupling %>% mutate(data = "extinction-prone")),
aes(x = as.factor(Heat),
y = est_decoupling,
colour = as.factor(CommRich),
shape = data)) +
geom_hline(yintercept = 0, colour = "grey") +
geom_point(position = position_dodge(width = 0.5)) +
geom_errorbar(position = position_dodge(width = 0.5),
aes(ymin = loCI_decoupling, ymax = hiCI_decoupling),
alpha=0.4, width=0.1) +
scale_colour_viridis_d(option = "viridis", end=0.85) +
scale_y_continuous(limits = c(-0.005, 0.015))+
labs(title = "effect of extinction on late recovery",
colour = "Inoculated\nCommunity\nRichness",
y = "Decoupling +/- 95% CI",
x = "Heat Duration (hrs)",
shape = "communities")
ggplot(rbind(NoExtinct_late_decoupling %>% filter(Heat == 48) %>%
mutate(data = "no extinctions"),
late_decoupling %>% mutate(data = "extinction-prone")),
aes(x = as.factor(Heat),
y = est_decoupling,
colour = as.factor(CommRich),
shape = data)) +
geom_hline(yintercept = 0, colour = "grey") +
geom_point(position = position_dodge(width = 0.5)) +
geom_errorbar(position = position_dodge(width = 0.5),
aes(ymin = loCI_decoupling, ymax = hiCI_decoupling),
alpha=0.4, width=0.1) +
scale_colour_viridis_d(option = "viridis", end=0.85) +
labs(title = "effect of extinction on late recovery",
colour = "Inoculated\nCommunity\nRichness",
y = "Decoupling +/- 95% CI",
x = "Heat Duration (hrs)",
shape = "communities")
Maddy suggested an analysis along the lines of Loreau & Hector 2001. The idea is to investigate the role of biodiversity by partitioning the net biodiversity effect into selection effects (as biodiversity increases, so does the probability that the “best” species dominates the community) and complementarity effects (as biodiversity increases, so does the opportunity for niche partitioning between species). Positive complementarity means the average density of each species in mixed communities is greater than expected from its monoculture, which suggests resource partitioning or facilitation between species. Negative complementarity means the average density of each species in mixed communities is less than expected from its monoculture, which suggests (physical or chemical) interference between species. A positive (or negative) selection effect arises when species with the highest (or lowest) monoculture densities dominate the mixed communities. Finally, the sum of complementarity and selection effects is the net biodiversity effect: it measures how much the total density in mixed communities deviates from the monoculture expectation.
Biodiversity effects can readily be calculated using the R package partitionBEFsp.
I will use the monoculture densities within each Heat:Day treatment to calculate the biodiversity effects.
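The additive partition itself is simple enough to sketch independently of partitionBEFsp. The following is a minimal illustration of Loreau & Hector's formulas (the function name `additive_partition` is mine, not from the package): M holds monoculture densities, Yo the observed per-species densities in the mixture, and RYe the expected relative yields (1/N for an even inoculum).

```r
# Hedged sketch of the Loreau & Hector (2001) additive partition
additive_partition <- function(M, Yo, RYe = rep(1 / length(M), length(M))) {
  N <- length(M)
  dRY <- Yo / M - RYe   # deviations from the expected relative yields
  # Loreau & Hector define covariance as a population covariance (divide by N)
  pop_cov <- sum((dRY - mean(dRY)) * (M - mean(M))) / N
  complementarity <- N * mean(dRY) * mean(M)
  selection <- N * pop_cov
  c(C = complementarity, S = selection, net = complementarity + selection)
}
# sanity check: the net effect equals observed minus expected total density,
# e.g. additive_partition(M = c(10, 20), Yo = c(8, 12)) gives
# C = 6, S = -1, net = 5 (= sum(Yo) - sum(RYe * M) = 20 - 15)
```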
# drop the blank community (0_0_0_0); it is uninformative and should have been removed upstream
absDensity <- absDensity %>% filter(community != "0_0_0_0")
#
# a wrapper to feed input to partitionBEFsp functions
# monocult (and cocult) correspond to a specific Heat, Day treatment (and community)
get_biodiversity_effects_vector <- function(monocult, cocult) {
# get the monoculture biomass for only the species present in cocult
M <- monocult$monoMeanDensity[unlist(cocult[,3:6])]
# get the coculture biomass for the appropriate species
P <- unlist(cocult[7:10])[unlist(cocult[,3:6])]
# sanity check that M and P are the same length
if(length(M) != length(P))
stop("unequal lengths detected for M and P")
# if any of the monocultures are NA, then return NA for both S and C
if(any(is.na(M)))
return(c(S=NA, C=NA))
# if any of the monocultures are 0, then return NaN for both S and C
if(any(M == 0))
return(c(S=NaN, C=NaN))
# I am over-writing the default behaviour of classic_partition because it can
# yield NaN, Inf, or NA values depending on the details of the data
# (i.e., probably because of undefined quantities or division by 0)
# calculate the biodiversity effects
return(unlist(classic_partition(DRY=calculate_DRY(P, M, Q=length(M)), M)))
}
## NOTE: when at least 1 monoculture is NA, it will yield NA values for both S & C
## However, when at least 1 monoculture is 0, it will yield NaN for both S & C
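This sentinel distinction relies on R's arithmetic conventions: dividing by NA propagates NA, while 0/0 yields NaN. Note that `is.na()` is TRUE for both, so any downstream filter that needs to separate the two cases must use `is.nan()`:

```r
# Minimal illustration of the NA vs NaN sentinel values used above
c(1 / NA, 0 / 0)          # NA NaN
is.na(c(1 / NA, 0 / 0))   # TRUE TRUE  (NaN also counts as missing)
is.nan(c(1 / NA, 0 / 0))  # FALSE TRUE (only 0/0 is NaN)
```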
# a function to estimate biodiversity effects within each Heat:Day treatment
get_biodiversity_effects_forDF <- function(absoluteDensity_df) {
# estimate biomass in monocultures for each Day and Heat treatment
monoculture_biomass <- absoluteDensity_df %>% filter(Day > 0, CommRich == 1) %>%
ungroup() %>% group_by(Heat, Day, community) %>%
summarise(mean_putida = mean(Conc_putida, na.rm = TRUE),
mean_protegens = mean(Conc_protegens, na.rm = TRUE),
mean_grimontii = mean(Conc_grimontii, na.rm=TRUE),
mean_veronii = mean(Conc_veronii, na.rm=TRUE),
.groups="keep") %>% # suppress the summarise() grouping message
pivot_longer(cols=starts_with("mean_"),
names_to="species",
names_prefix="mean_",
values_to="monoMeanDensity")
# drop the redundant rows (e.g., Protegens value for the putida monoculture community 1_0_0_0)
monoculture_biomass <- monoculture_biomass[-which(monoculture_biomass$community == "1_0_0_0" & monoculture_biomass$species != "putida"),]
monoculture_biomass <- monoculture_biomass[-which(monoculture_biomass$community == "0_1_0_0" & monoculture_biomass$species != "protegens"),]
monoculture_biomass <- monoculture_biomass[-which(monoculture_biomass$community == "0_0_1_0" & monoculture_biomass$species != "grimontii"),]
monoculture_biomass <- monoculture_biomass[-which(monoculture_biomass$community == "0_0_0_1" & monoculture_biomass$species != "veronii"),]
# and now we can drop the redundant species column
monoculture_biomass <- monoculture_biomass %>% select(-species)
# put NA values to indicate missing monoculture data
# create a temporary variable with all possible combinations of treatments
combos <- expand.grid(unique(monoculture_biomass$Heat),
unique(monoculture_biomass$Day),
unique(monoculture_biomass$community))
colnames(combos) <- c("Heat", "Day", "community")
NA_rows <- anti_join(combos, monoculture_biomass,
by = c("Heat", "Day", "community")) %>% mutate(monoMeanDensity = NA)
monoculture_biomass <- rbind(monoculture_biomass, NA_rows)
rm(combos, NA_rows)
# separate community column into 4 columns indicating the species (with values 1 or 0)
monoculture_biomass <- monoculture_biomass %>%
arrange(desc(community)) %>% # re-order with putida first
separate_wider_delim(community, "_", names = c("putida", "protegens", "grimontii", "veronii"))
# loop through the communities, Day, and Heat treatments to calculate biodiversity effects
cocultures <- absoluteDensity_df %>% filter(Day > 0, CommRich > 1) %>%
ungroup() %>% group_by(Heat, Day, putida, protegens, grimontii, veronii) %>%
summarise(Conc_putida = mean(Conc_putida, na.rm = TRUE),
Conc_protegens = mean(Conc_protegens, na.rm = TRUE),
Conc_grimontii = mean(Conc_grimontii, na.rm=TRUE),
Conc_veronii = mean(Conc_veronii, na.rm=TRUE),
.groups="keep") # suppress the summarise() grouping message
# convert the columns indicating species presence/absence into logical
cocultures$putida <- as.logical(cocultures$putida)
cocultures$protegens <- as.logical(cocultures$protegens)
cocultures$grimontii <- as.logical(cocultures$grimontii)
cocultures$veronii <- as.logical(cocultures$veronii)
# finally, loop through the days & heat treatments
biodiv_df <- data.frame()
for(heat in unique(cocultures$Heat)){
for(day in unique(cocultures$Day[which(cocultures$Heat == heat)])) {
temp_monocult <- monoculture_biomass %>% filter(Heat == heat, Day == day)
temp_cocult <- cocultures %>% filter(Heat == heat, Day == day)
for(i in 1:nrow(temp_cocult)){
# compute S and C with a single call instead of evaluating the wrapper twice
SC <- get_biodiversity_effects_vector(monocult = temp_monocult,
cocult = temp_cocult[i,])
biodiv_df <- rbind(biodiv_df,
temp_cocult[i,] %>% mutate(S = SC[1], C = SC[2]))
}
rm(temp_monocult, temp_cocult)
}
}
return(biodiv_df)
}
biodiversityEffects_df <- get_biodiversity_effects_forDF(absoluteDensity_df = absDensity)
# add the annotation back to biodiversity effects df
biodiversityEffects_df <- inner_join(biodiversityEffects_df,
absDensity %>%
select(-uniqID, -Conc_putida, -Conc_protegens,
-Conc_grimontii, -Conc_veronii,
-Total_density, -Diversity) %>%
filter(Day > 0, CommRich > 1)) %>%
distinct()
## Joining with `by = join_by(Heat, Day, putida, protegens, grimontii, veronii)`
ggplot(biodiversityEffects_df,
aes(x=as.factor(Day), y=S, colour=as.factor(CommRich))) +
facet_grid(protegens~Heat) +
geom_beeswarm(alpha=0.5) +
scale_colour_viridis_d(option = "viridis", begin=0.2, end=0.95) +
geom_hline(yintercept = 0, colour="grey")
## Warning: Removed 43 rows containing missing values or values outside the scale range
## (`geom_point()`).
ggplot(biodiversityEffects_df,
aes(x=as.factor(Day), y=C, colour=as.factor(CommRich))) +
facet_grid(protegens~Heat) +
geom_beeswarm(alpha=0.5) +
scale_colour_viridis_d(option = "viridis", begin=0.2, end=0.95) +
geom_hline(yintercept = 0, colour="grey")
## Warning: Removed 43 rows containing missing values or values outside the scale range
## (`geom_point()`).
# back up a bit to look at the monoculture densities (a.k.a. carrying capacities)
# let's describe the communities in the absence of heat
selection_control0 <- glmmTMB(S ~ as.factor(CommRich) + (1|Day),
data = biodiversityEffects_df %>% filter(Heat == 0))
simulateResiduals(fittedModel = selection_control0, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.968 0.02 0.752 0.392 0.552 0.912 0.948 0.172 0.028 0.576 0.364 0.884 0.352 0.72 0.508 0 0.924 0.252 0.528 0.452 ...
selection_control1 <- glmmTMB(S ~ putida + (1|Day),
data = biodiversityEffects_df %>% filter(Heat == 0))
simulateResiduals(fittedModel = selection_control1, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.972 0.028 0.764 0.24 0.652 0.956 0.924 0.224 0.024 0.548 0.256 0.896 0.388 0.74 0.4 0 0.964 0.224 0.632 0.408 ...
selection_control2 <- glmmTMB(S ~ protegens + (1|Day),
data = biodiversityEffects_df %>% filter(Heat == 0))
simulateResiduals(fittedModel = selection_control2, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.956 0.076 0.868 0.436 0.432 0.864 0.848 0.248 0.028 0.608 0.324 0.828 0.52 0.868 0.564 0 0.9 0.072 0.716 0.484 ...
selection_control3 <- glmmTMB(S ~ protegens + CommRich + (1|Day),
data = biodiversityEffects_df %>% filter(Heat == 0))
simulateResiduals(fittedModel = selection_control3, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.952 0.06 0.844 0.456 0.42 0.848 0.892 0.228 0.028 0.632 0.384 0.808 0.48 0.84 0.58 0 0.896 0.092 0.672 0.492 ...
AIC(selection_control0, selection_control1, selection_control2, selection_control3) %>% arrange(AIC)
BIC(selection_control0, selection_control1, selection_control2, selection_control3) %>% arrange(BIC)
summary(selection_control2)
## Family: gaussian ( identity )
## Formula: S ~ protegens + (1 | Day)
## Data: biodiversityEffects_df %>% filter(Heat == 0)
##
## AIC BIC logLik deviance df.resid
## 777.1 785.2 -384.6 769.1 51
##
## Random effects:
##
## Conditional model:
## Groups Name Variance Std.Dev.
## Day (Intercept) 1.561 1.249
## Residual 69302.571 263.254
## Number of obs: 55, groups: Day, 5
##
## Dispersion estimate for gaussian family (sigma^2): 6.93e+04
##
## Conditional model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -201.19 58.87 -3.418 0.000632 ***
## protegens -201.81 73.79 -2.735 0.006240 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
ggplot(biodiversityEffects_df %>% filter(Heat == 0) %>%
plyr::mutate(pred = predict(selection_control3)),
aes(x=as.factor(Day), y=S, colour=as.factor(CommRich))) +
facet_grid(~protegens) +
geom_beeswarm(alpha=0.5) +
geom_line(aes(y=pred, group=as.factor(CommRich))) +
scale_colour_viridis_d(option = "viridis", begin=0.2, end=0.95) +
geom_hline(yintercept = 0, colour="grey") +
labs(title = "not the preferred model!")
# and also for complementarity
comp_control0 <- glmmTMB(C ~ as.factor(CommRich) + (1|Day),
data = biodiversityEffects_df %>% filter(Heat == 0))
simulateResiduals(fittedModel = comp_control0, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.208 0.36 0.3 0.296 0.724 0.208 0.544 0.6 0.812 0.548 0.576 0.024 0.056 0.24 0.156 1 0.988 1 0.32 0.228 ...
comp_control1 <- glmmTMB(C ~ putida + (1|Day),
data = biodiversityEffects_df %>% filter(Heat == 0))
simulateResiduals(fittedModel = comp_control1, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.376 0.516 0.452 0.516 0.584 0.128 0.448 0.476 0.78 0.42 0.444 0.032 0.088 0.428 0.292 1 0.98 1 0.24 0.152 ...
comp_control2 <- glmmTMB(C ~ protegens + (1|Day),
data = biodiversityEffects_df %>% filter(Heat == 0))
simulateResiduals(fittedModel = comp_control2, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.06 0.436 0.388 0.468 0.464 0.072 0.356 0.712 0.928 0.664 0.648 0 0.068 0.36 0.22 1 0.948 1 0.448 0.336 ...
comp_control3 <- glmmTMB(C ~ protegens + CommRich + (1|Day),
data = biodiversityEffects_df %>% filter(Heat == 0))
simulateResiduals(fittedModel = comp_control3, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.068 0.516 0.448 0.452 0.508 0.076 0.292 0.764 0.916 0.652 0.536 0 0.08 0.428 0.196 1 0.964 1 0.52 0.316 ...
AIC(comp_control0, comp_control1, comp_control2, comp_control3) %>% arrange(AIC)
BIC(comp_control0, comp_control1, comp_control2, comp_control3) %>% arrange(BIC)
summary(comp_control2)
## Family: gaussian ( identity )
## Formula: C ~ protegens + (1 | Day)
## Data: biodiversityEffects_df %>% filter(Heat == 0)
##
## AIC BIC logLik deviance df.resid
## 744.1 752.1 -368.0 736.1 51
##
## Random effects:
##
## Conditional model:
## Groups Name Variance Std.Dev.
## Day (Intercept) 8.604e-05 9.276e-03
## Residual 3.801e+04 1.950e+02
## Number of obs: 55, groups: Day, 5
##
## Dispersion estimate for gaussian family (sigma^2): 3.8e+04
##
## Conditional model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 204.24 43.59 4.685 2.8e-06 ***
## protegens -165.07 54.65 -3.021 0.00252 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
ggplot(biodiversityEffects_df %>% filter(Heat == 0) %>%
plyr::mutate(pred = predict(comp_control3)),
aes(x=as.factor(Day), y=C, colour=as.factor(CommRich))) +
facet_grid(~protegens) +
geom_beeswarm(alpha=0.5) +
geom_line(aes(y=pred, group=as.factor(CommRich))) +
scale_colour_viridis_d(option = "viridis", begin=0.2, end=0.95) +
geom_hline(yintercept = 0, colour="grey") +
labs(title = "not the preferred model!")
#########
# get effect sizes for selection and complementarity
#########
# scale the data by its standard deviation
biodiversityEffects_df$S_scale <- scale(biodiversityEffects_df$S,
scale = sd(biodiversityEffects_df$S, na.rm = TRUE),
center = FALSE)
biodiversityEffects_df$C_scale <- scale(biodiversityEffects_df$C,
scale = sd(biodiversityEffects_df$C, na.rm = TRUE),
center = FALSE)
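A side note on `scale()`: even with `center = FALSE` it returns a 1-column matrix (with a `scaled:scale` attribute) rather than a plain vector, which can occasionally trip up downstream code expecting a numeric vector. A minimal self-contained sketch on toy data:

```r
x <- c(2, 4, 6, NA)
m <- scale(x, center = FALSE, scale = sd(x, na.rm = TRUE))
is.matrix(m)          # TRUE: scale() always returns a matrix
v <- as.numeric(m)    # drop the matrix wrapper if a plain vector is needed
```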
# re-arrange the levels so that emmeans can be run:
biodiversityEffects_df$Heat <- as.character(biodiversityEffects_df$Heat)
biodiversityEffects_df$Heat[which(biodiversityEffects_df$Heat == 0)] <- "control"
# !!! emmeans expects the control to be the very *last* level !!!
biodiversityEffects_df$Heat <- factor(biodiversityEffects_df$Heat,
levels = c("6", "12", "24", "48", "control"))
# split the data up and fit it to LM's
####################
# 6h heat duration
####################
# grab just the treatment with its associated control data
biodiv_6h <- rbind(biodiversityEffects_df %>% filter(Heat == "6"),
biodiversityEffects_df %>% filter(Heat == "control", Day < 4))
# create a column for last day of heat, first day of recovery, and last day of recovery
biodiv_6h$Trtmt_Day <- "resist"
biodiv_6h$Trtmt_Day[biodiv_6h$Day == 2] <- "recov_1"
biodiv_6h$Trtmt_Day[biodiv_6h$Day == 3] <- "recov_2"
sel_6h_0 <- glmmTMB(S_scale ~ Heat*as.factor(CommRich)*Trtmt_Day,
data = biodiv_6h)
simulateResiduals(fittedModel = sel_6h_0, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.528 0 0.648 0.22 0.932 1 0.988 0.028 0.032 0.568 0.496 0.8 0.172 0.424 0.272 0.22 0.412 0.388 0.86 0.644 ...
sel_6h_1 <- glmmTMB(S_scale ~ Heat*as.factor(putida)*Trtmt_Day,
data = biodiv_6h)
simulateResiduals(fittedModel = sel_6h_1, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.816 0.052 0.896 0.364 0.912 1 0.956 0.028 0.004 0.42 0.064 0.856 0.232 0.508 0.316 0.22 0.412 0.348 0.86 0.608 ...
sel_6h_2 <- glmmTMB(S_scale ~ Heat*protegens*Trtmt_Day,
data = biodiv_6h)
simulateResiduals(fittedModel = sel_6h_2, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.024 0.048 1 0.496 0.468 0.952 0.652 0.188 0.14 0.932 0.396 0.908 0.112 0.432 0.188 0.184 0.468 0.348 0.924 0.644 ...
AIC(sel_6h_0, sel_6h_1, sel_6h_2) %>% arrange(AIC)
BIC(sel_6h_0, sel_6h_1, sel_6h_2) %>% arrange(BIC)
# create data.frame for plotting
S_predict <- cbind(sel_6h_2$frame,
predict(sel_6h_2, type="response"))
colnames(S_predict)[c(1, 3, 5)] <- c("observed", "protegens", "predicted")
# plot the model predictions against the data
ggplot(S_predict,
aes(x=Trtmt_Day, y=observed, colour=Heat)) +
facet_grid(~protegens) +
geom_jitter(alpha=0.4) +
geom_line(aes(y=predicted, group=Heat)) +
labs(y="Selection effect (rescaled)",
colour="Heat")
# cleanup
rm(S_predict)
comp_6h_0 <- glmmTMB(C_scale ~ Heat*as.factor(CommRich)*Trtmt_Day,
data = biodiv_6h)
simulateResiduals(fittedModel = comp_6h_0, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.196 0.928 0.676 0.88 0.248 0.104 0.02 0.792 0.868 0.464 0.496 0.732 0.284 0.296 0.528 0.552 0.92 0.796 0.192 0.332 ...
comp_6h_1 <- glmmTMB(C_scale ~ Heat*as.factor(putida)*Trtmt_Day,
data = biodiv_6h)
simulateResiduals(fittedModel = comp_6h_1, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.048 0.792 0.42 0.772 0.296 0.116 0.02 0.844 0.908 0.52 0.944 0.82 0.344 0.344 0.428 0.596 0.936 0.76 0.196 0.228 ...
comp_6h_2 <- glmmTMB(C_scale ~ Heat*protegens*Trtmt_Day,
data = biodiv_6h)
simulateResiduals(fittedModel = comp_6h_2, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.724 0.764 0.272 0.752 0.764 0.508 0.092 0.456 0.48 0.108 0.684 0.46 0.492 0.552 0.688 0.288 0.748 0.468 0.316 0.416 ...
AIC(comp_6h_0, comp_6h_1, comp_6h_2) %>% arrange(AIC)
BIC(comp_6h_0, comp_6h_1, comp_6h_2) %>% arrange(BIC)
# create data.frame for plotting
C_predict <- cbind(comp_6h_2$frame,
predict(comp_6h_2, type="response"))
colnames(C_predict)[c(1, 3, 5)] <- c("observed", "protegens", "predicted")
# plot the model predictions against the data
ggplot(C_predict,
aes(x=Trtmt_Day, y=observed, colour=Heat)) +
facet_grid(~protegens) +
geom_jitter(alpha=0.4) +
geom_line(aes(y=predicted, group=Heat)) +
labs(y="Complementarity effect (rescaled)",
colour="Heat")
# cleanup
rm(C_predict)
####################
# 12h heat duration
####################
# grab just the treatment with its associated control data
biodiv_12h <- rbind(biodiversityEffects_df %>% filter(Heat == "12", Day > 1),
biodiversityEffects_df %>% filter(Heat == "control", Day > 1, Day !=5))
# create a column for last day of heat, first day of recovery, and last day of recovery
biodiv_12h$Trtmt_Day <- "resist"
biodiv_12h$Trtmt_Day[biodiv_12h$Day == 3] <- "recov_1"
biodiv_12h$Trtmt_Day[biodiv_12h$Day == 4] <- "recov_2"
sel_12h_0 <- glmmTMB(S_scale ~ Heat*as.factor(CommRich)*Trtmt_Day,
data = biodiv_12h)
simulateResiduals(fittedModel = sel_12h_0, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.4 0.444 0.376 0.48 0.868 0.788 0.876 0.104 0.24 0.296 0.496 0.736 0.012 0.376 0.252 0.596 0.936 0.784 0.604 0.336 ...
sel_12h_1 <- glmmTMB(S_scale ~ Heat*putida*Trtmt_Day,
data = biodiv_12h)
simulateResiduals(fittedModel = sel_12h_1, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.468 0.5 0.492 0.508 0.876 0.812 0.852 0.12 0.228 0.284 0.244 0.916 0.056 0.688 0.4 0.556 0.924 0.656 0.58 0.22 ...
sel_12h_2 <- glmmTMB(S_scale ~ Heat*protegens*Trtmt_Day,
data = biodiv_12h)
simulateResiduals(fittedModel = sel_12h_2, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.176 0.684 0.624 0.632 0.692 0.532 0.624 0.216 0.38 0.424 0.408 0.436 0.04 0.72 0.4 0.336 0.728 0.416 0.86 0.532 ...
AIC(sel_12h_0, sel_12h_1, sel_12h_2) %>% arrange(AIC)
BIC(sel_12h_0, sel_12h_1, sel_12h_2) %>% arrange(BIC)
# create data.frame for plotting
S_predict <- cbind(sel_12h_2$frame,
predict(sel_12h_2, type="response"))
colnames(S_predict)[c(1, 3, 5)] <- c("observed", "protegens", "predicted")
# plot the model predictions against the data
ggplot(S_predict,
aes(x=Trtmt_Day, y=observed, colour=Heat)) +
facet_grid(~protegens) +
geom_jitter(alpha=0.4) +
geom_line(aes(y=predicted, group=Heat)) +
labs(y="Selection effect (rescaled)",
colour="Heat")
# cleanup
rm(S_predict)
comp_12h_0 <- glmmTMB(C_scale ~ Heat*as.factor(CommRich)*Trtmt_Day,
data = biodiv_12h)
simulateResiduals(fittedModel = comp_12h_0, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.764 0.496 0.496 0.532 0.496 0.356 0.484 0.384 0.376 0.488 0.496 0.424 0.82 0.672 0.548 0.204 0.376 0.616 0.468 0.456 ...
comp_12h_1 <- glmmTMB(C_scale ~ Heat*putida*Trtmt_Day,
data = biodiv_12h)
simulateResiduals(fittedModel = comp_12h_1, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.744 0.42 0.396 0.432 0.572 0.412 0.484 0.444 0.368 0.484 0.628 0.244 0.712 0.484 0.5 0.148 0.332 0.68 0.416 0.6 ...
comp_12h_2 <- glmmTMB(C_scale ~ Heat*protegens*Trtmt_Day,
data = biodiv_12h)
simulateResiduals(fittedModel = comp_12h_2, plot = TRUE)
## qu = 0.75, log(sigma) = -2.71628 : outer Newton did not converge fully.
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.784 0.52 0.528 0.504 0.468 0.288 0.416 0.396 0.332 0.444 0.624 0.464 0.752 0.488 0.5 0.188 0.412 0.812 0.26 0.42 ...
AIC(comp_12h_0, comp_12h_1, comp_12h_2) %>% arrange(AIC)
BIC(comp_12h_0, comp_12h_1, comp_12h_2) %>% arrange(BIC)
# create data.frame for plotting
C_predict <- cbind(comp_12h_2$frame,
predict(comp_12h_2, type="response"))
colnames(C_predict)[c(1, 3, 5)] <- c("observed", "protegens", "predicted")
# plot the model predictions against the data
ggplot(C_predict,
aes(x=Trtmt_Day, y=observed, colour=Heat)) +
facet_grid(~protegens) +
geom_jitter(alpha=0.4) +
geom_line(aes(y=predicted, group=Heat)) +
labs(y="Complementarity effect (rescaled)",
colour="Heat")
# cleanup
rm(C_predict)
####################
# 24h heat duration
####################
# grab just the treatment with its associated control data
biodiv_24h <- rbind(biodiversityEffects_df %>% filter(Heat == "24", Day > 1),
biodiversityEffects_df %>% filter(Heat == "control", Day > 1, Day !=5))
# create a column for last day of heat, first day of recovery, and last day of recovery
biodiv_24h$Trtmt_Day <- "resist"
biodiv_24h$Trtmt_Day[biodiv_24h$Day == 3] <- "recov_1"
biodiv_24h$Trtmt_Day[biodiv_24h$Day == 4] <- "recov_2"
sel_24h_0 <- glmmTMB(S_scale ~ Heat*as.factor(CommRich)*Trtmt_Day,
data = biodiv_24h)
## dropping columns from rank-deficient conditional model: Heatcontrol:as.factor(CommRich)4:Trtmt_Dayresist
simulateResiduals(fittedModel = sel_24h_0, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.544 0.512 0.548 0.516 0.012 0.436 0.428 0.448 0.996 0.98 0.984 0.036 0.16 0.136 0.524 0.696 0.116 0.216 0.544 0.94 ...
sel_24h_1 <- glmmTMB(S_scale ~ Heat*putida*Trtmt_Day,
data = biodiv_24h)
simulateResiduals(fittedModel = sel_24h_1, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.52 0.532 0.556 0.528 0.064 0.668 0.64 0.688 0.976 0.952 0.972 0.016 0.108 0.092 0.156 0.856 0.264 0.38 0.384 0.972 ...
sel_24h_2 <- glmmTMB(S_scale ~ Heat*protegens*Trtmt_Day,
data = biodiv_24h)
simulateResiduals(fittedModel = sel_24h_2, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.544 0.552 0.54 0.484 0 0.772 0.776 0.784 0.948 0.88 0.944 0.056 0.412 0.284 0.448 0.136 0.576 0.724 0.78 0.572 ...
AIC(sel_24h_0, sel_24h_1, sel_24h_2) %>% arrange(AIC)
BIC(sel_24h_0, sel_24h_1, sel_24h_2) %>% arrange(BIC)
# create data.frame for plotting
S_predict <- cbind(sel_24h_2$frame,
predict(sel_24h_2, type="response"))
colnames(S_predict)[c(1, 3, 5)] <- c("observed", "protegens", "predicted")
# plot the model predictions against the data
ggplot(S_predict,
aes(x=Trtmt_Day, y=observed, colour=Heat)) +
facet_grid(~protegens) +
geom_jitter(alpha=0.4) +
geom_line(aes(y=predicted, group=Heat)) +
labs(y="Selection effect (rescaled)",
colour="Heat")
# cleanup
rm(S_predict)
comp_24h_0 <- glmmTMB(C_scale ~ Heat*as.factor(CommRich)*Trtmt_Day,
data = biodiv_24h)
## dropping columns from rank-deficient conditional model: Heatcontrol:as.factor(CommRich)4:Trtmt_Dayresist
simulateResiduals(fittedModel = comp_24h_0, plot = TRUE)
## qu = 0.25, log(sigma) = -3.308981 : outer Newton did not converge fully.
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.528 0.568 0.512 0.516 1 0.188 0.188 0.536 0.34 0.344 0.72 0.104 0.416 0.332 0.524 0.82 0.42 0.372 0.216 0.512 ...
comp_24h_1 <- glmmTMB(C_scale ~ Heat*putida*Trtmt_Day,
data = biodiv_24h)
simulateResiduals(fittedModel = comp_24h_1, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.52 0.588 0.528 0.504 1 0.104 0.104 0.124 0.628 0.692 0.684 0.288 0.332 0.28 0.564 0.856 0.428 0.38 0.256 0.564 ...
comp_24h_2 <- glmmTMB(C_scale ~ Heat*protegens*Trtmt_Day,
data = biodiv_24h)
simulateResiduals(fittedModel = comp_24h_2, plot = TRUE)
## Object of Class DHARMa with simulated residuals based on 250 simulations with refit = FALSE . See ?DHARMa::simulateResiduals for help.
##
## Scaled residual values: 0.536 0.552 0.52 0.504 1 0.54 0.512 0.556 0.116 0.156 0.14 0.364 0.404 0.332 0.668 0.792 0.548 0.436 0.328 0.364 ...
AIC(comp_24h_0, comp_24h_1, comp_24h_2) %>% arrange(AIC)
BIC(comp_24h_0, comp_24h_1, comp_24h_2) %>% arrange(BIC)
# create data.frame for plotting
C_predict <- cbind(comp_24h_2$frame,
predict(comp_24h_2, type="response"))
colnames(C_predict)[c(1, 3, 5)] <- c("observed", "protegens", "predicted")
# plot the model predictions against the data
ggplot(C_predict,
aes(x=Trtmt_Day, y=observed, colour=Heat)) +
facet_grid(~protegens) +
geom_jitter(alpha=0.4) +
geom_line(aes(y=predicted, group=Heat)) +
labs(y="Complementarity effect (rescaled)",
colour="Heat")
# cleanup
rm(C_predict)
#######################
# effect sizes for selection effect
#######################
# plot the effect size contingent on protegens
effect_6h_protegens <- eff_size(emmeans(sel_6h_2, ~ Heat | Trtmt_Day*protegens, data = biodiv_6h),
sigma(sel_6h_2),
edf = df.residual(sel_6h_2))
effect_12h_protegens <- eff_size(emmeans(sel_12h_2, ~ Heat | Trtmt_Day*protegens, data = biodiv_12h),
sigma(sel_12h_2),
edf = df.residual(sel_12h_2))
effect_24h_protegens <- eff_size(emmeans(sel_24h_2, ~ Heat | Trtmt_Day*protegens, data = biodiv_24h),
sigma(sel_24h_2),
edf = df.residual(sel_24h_2))
# a function that extracts the confidence intervals from eff_size contingent on protegens
get_effsize_CIs <- function(eff_size_object, heat_trtmt) {
data.frame(Heat = heat_trtmt,
Trtmt_Day = confint(eff_size_object)[[2]],
protegens = confint(eff_size_object)[[3]],
effect_est = confint(eff_size_object)[[4]],
effect_loCI = confint(eff_size_object)[[7]],
effect_hiCI = confint(eff_size_object)[[8]])
}
# create a data.frame for plotting marginal effect sizes using a forest plot
sel_protegens <- data.frame()
sel_protegens <- rbind(sel_protegens,
get_effsize_CIs(effect_6h_protegens, heat_trtmt = 6),
get_effsize_CIs(effect_12h_protegens, heat_trtmt = 12),
get_effsize_CIs(effect_24h_protegens, heat_trtmt = 24))
# re-order the levels of Trtmt_Day to go from resistance to recovery then rename them for nice plotting
sel_protegens$Trtmt_Day <- factor(sel_protegens$Trtmt_Day,
levels = c("recov_2", "recov_1", "resist"))
levels(sel_protegens$Trtmt_Day) <- c("Recovery (Day 2)", "Recovery (Day 1)", "Resistance")
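Renaming levels by positional assignment (as above) silently depends on the level order set in the previous step; base R also supports an order-independent, name-based relabelling, shown here on toy data:

```r
# relabel factor levels by name rather than by position (sketch)
x <- factor(c("resist", "recov_1", "recov_2"),
            levels = c("recov_2", "recov_1", "resist"))
levels(x) <- list("Recovery (Day 2)" = "recov_2",
                  "Recovery (Day 1)" = "recov_1",
                  "Resistance"       = "resist")
as.character(x)  # "Resistance" "Recovery (Day 1)" "Recovery (Day 2)"
```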
ggplot(sel_protegens,
aes(x = effect_est, y = as.factor(Heat), colour = Trtmt_Day, shape = as.logical(protegens))) +
facet_grid(. ~ protegens) +
geom_vline(xintercept = 0, colour="darkgrey") +
geom_point(position = position_dodge(width = 0.5)) +
geom_errorbarh(position = position_dodge(width = 0.5),
aes(xmin = effect_loCI, xmax = effect_hiCI), height = 0.1) +
scale_colour_manual(values=trtmt_pal) +
labs(x = "Effect Size on Selection Effect",
y = "Heat duration",
shape = "protegens\npresent?")
# average across the effect of protegens
posthoc_6h <- emmeans(effect_6h_protegens,
pairwise ~ Trtmt_Day,
data = biodiv_6h)
## NOTE: Results may be misleading due to involvement in interactions
posthoc_12h <- emmeans(effect_12h_protegens,
pairwise ~ Trtmt_Day,
data = biodiv_12h)
## NOTE: Results may be misleading due to involvement in interactions
posthoc_24h <- emmeans(effect_24h_protegens,
pairwise ~ Trtmt_Day,
data = biodiv_24h)
## NOTE: Results may be misleading due to involvement in interactions
# a function that extracts the confidence intervals from a post-hoc object
get_posthoc <- function(posthoc_object, heat_trtmt) {
output <- multcomp::cld(posthoc_object, alpha=0.05/3, Letters = letters) %>%
data.frame() %>%
select(-df)
colnames(output)[2:6] <- c("est", "SE", "loCI", "hiCI", "groups")
output$Heat <- heat_trtmt
return(output)
}
# create a data.frame for plotting marginal effect sizes using a forest plot with the group labels
sel_effects <- data.frame()
sel_effects <- rbind(sel_effects,
get_posthoc(posthoc_6h, heat_trtmt = 6),
get_posthoc(posthoc_12h, heat_trtmt = 12),
get_posthoc(posthoc_24h, heat_trtmt = 24))
# re-order the levels of Trtmt_Day to go from resistance to recovery then rename them for nice plotting
sel_effects$Trtmt_Day <- factor(sel_effects$Trtmt_Day,
levels = c("recov_2", "recov_1", "resist"))
levels(sel_effects$Trtmt_Day) <- c("Recovery (Day 2)", "Recovery (Day 1)", "Resistance")
# plot
ggplot(sel_effects,
aes(x = est, y = as.factor(Heat), colour = Trtmt_Day)) +
geom_vline(xintercept = 0, colour="darkgrey") +
geom_point(position = position_dodge(width = 0.5)) +
geom_errorbarh(position = position_dodge(width = 0.5),
aes(xmin = loCI, xmax = hiCI), height = 0.1) +
geom_text(position = position_dodge(width = 0.5),
aes(x=-2, label=groups)) +
scale_colour_manual(values=trtmt_pal) +
labs(x = "Effect Size on Selection effect",
y = "Heat",
title = "Averaged across protegens")
# the sample sizes are fixed at the number of non-monoculture communities (which is 11)
sel_effects$n <- 11
# estimate the SD from the SE
sel_effects <- sel_effects %>% mutate(SD = SE * sqrt(n)) %>%
# re-order by Heat and Trtmt_Day
arrange(Heat, Trtmt_Day)
# all pairwise combinations of comparisons between the same treatment day for different durations
temp <- t(combn(c(1,4,7), 2))
combos <- rbind(temp, temp+1, temp+2)
rm(temp)
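The index arithmetic above relies on `sel_effects` being arranged so that each heat duration occupies a consecutive block of three rows (one per Trtmt_Day); rows 1, 4, 7 lead each block, and the `+1`/`+2` shifts move to the other two days. A quick self-contained sanity check that every generated pair compares the same position within a block:

```r
temp <- t(combn(c(1, 4, 7), 2))            # pairs of block-leading rows
combos <- rbind(temp, temp + 1, temp + 2)  # shift to the other two Trtmt_Days
nrow(combos)  # 9 comparisons: 3 Trtmt_Days x 3 duration pairs
# both members of each pair sit at the same offset within their 3-row block
stopifnot(all((combos[, 1] - 1) %% 3 == (combos[, 2] - 1) %% 3))
```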
# loop through all the combinations and do the t-tests
sel_effects_ttests <- data.frame()
for(i in 1:nrow(combos)){
sel_effects_ttests <- rbind(sel_effects_ttests,
run_ttest(row_x = combos[i,1],
row_y = combos[i,2],
summary_stats_df = sel_effects))
}
sel_effects_ttests$adjusted_p <- p.adjust(sel_effects_ttests$pvalue, method = "bonferroni")
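For reference, the Bonferroni method just multiplies each p-value by the number of tests and caps the result at 1 (which is why several adjusted_p values in the printed tables are exactly 1). A self-contained check:

```r
p_raw <- c(0.0002, 0.02, 0.5)
p_adj <- p.adjust(p_raw, method = "bonferroni")
p_adj  # 0.0006 0.0600 1.0000
stopifnot(all.equal(p_adj, pmin(1, p_raw * length(p_raw))))
```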
sel_effects_ttests$Trtmt_Day <- sel_effects$Trtmt_Day[combos[,1]]
sel_effects_ttests$Heat_1 <- sel_effects$Heat[combos[,1]]
sel_effects_ttests$Heat_2 <- sel_effects$Heat[combos[,2]]
print(sel_effects_ttests)
## t_statistic df pvalue adjusted_p Trtmt_Day Heat_1
## t 3.743394 19.97977 1.282738e-03 1.154465e-02 Recovery (Day 2) 6
## t1 -5.356732 19.99884 3.044592e-05 2.740133e-04 Recovery (Day 2) 6
## t2 -9.001005 19.98830 1.809828e-08 1.628846e-07 Recovery (Day 2) 12
## t3 -2.961968 19.99813 7.707862e-03 6.937076e-02 Recovery (Day 1) 6
## t4 -13.050617 19.99807 3.047147e-11 2.742433e-10 Recovery (Day 1) 6
## t5 -10.137594 20.00000 2.512998e-09 2.261698e-08 Recovery (Day 1) 12
## t6 -12.568552 19.99978 5.975158e-11 5.377643e-10 Resistance 6
## t7 -8.581083 17.53946 1.089207e-07 9.802865e-07 Resistance 6
## t8 1.343844 17.50674 1.961577e-01 1.000000e+00 Resistance 12
## Heat_2
## t 12
## t1 24
## t2 24
## t3 12
## t4 24
## t5 24
## t6 12
## t7 24
## t8 24
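`run_ttest()` (defined earlier in this analysis) works from the summary statistics (`est`, `SD`, `n`) rather than raw data. Assuming it implements a Welch two-sample t-test (consistent with the fractional df in the printed table), the calculation is roughly the following sketch; the function and argument names here are illustrative, not the actual implementation. It can be cross-checked against `t.test()` on raw data with matching summaries:

```r
# Welch t-test from summary statistics; names are illustrative
welch_from_summary <- function(est_x, sd_x, n_x, est_y, sd_y, n_y) {
  se2_x <- sd_x^2 / n_x
  se2_y <- sd_y^2 / n_y
  t_stat <- (est_x - est_y) / sqrt(se2_x + se2_y)
  # Welch-Satterthwaite approximation to the degrees of freedom
  df <- (se2_x + se2_y)^2 / (se2_x^2 / (n_x - 1) + se2_y^2 / (n_y - 1))
  data.frame(t_statistic = t_stat, df = df, pvalue = 2 * pt(-abs(t_stat), df))
}
# cross-check against t.test() run on raw data with the same summaries
x <- c(1, 2, 3, 4, 5); y <- c(2, 4, 6, 8, 10)
a <- welch_from_summary(mean(x), sd(x), length(x), mean(y), sd(y), length(y))
b <- t.test(x, y)
```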
#######################
# effect sizes for complementarity effect
#######################
# plot the effect size contingent on protegens
effect_6h_protegens <- eff_size(emmeans(comp_6h_2, ~ Heat | Trtmt_Day*protegens, data = biodiv_6h),
sigma(comp_6h_2),
edf = df.residual(comp_6h_2))
effect_12h_protegens <- eff_size(emmeans(comp_12h_2, ~ Heat | Trtmt_Day*protegens, data = biodiv_12h),
sigma(comp_12h_2),
edf = df.residual(comp_12h_2))
effect_24h_protegens <- eff_size(emmeans(comp_24h_2, ~ Heat | Trtmt_Day*protegens, data = biodiv_24h),
sigma(comp_24h_2),
edf = df.residual(comp_24h_2))
# a function that extracts the confidence intervals from eff_size contingent on protegens
get_effsize_CIs <- function(eff_size_object, heat_trtmt) {
data.frame(Heat = heat_trtmt,
Trtmt_Day = confint(eff_size_object)[[2]],
protegens = confint(eff_size_object)[[3]],
effect_est = confint(eff_size_object)[[4]],
effect_loCI = confint(eff_size_object)[[7]],
effect_hiCI = confint(eff_size_object)[[8]])
}
# create a data.frame for plotting marginal effect sizes using a forest plot
comp_protegens <- data.frame()
comp_protegens <- rbind(comp_protegens,
get_effsize_CIs(effect_6h_protegens, heat_trtmt = 6),
get_effsize_CIs(effect_12h_protegens, heat_trtmt = 12),
get_effsize_CIs(effect_24h_protegens, heat_trtmt = 24))
# re-order the levels of Trtmt_Day to go from resistance to recovery then rename them for nice plotting
comp_protegens$Trtmt_Day <- factor(comp_protegens$Trtmt_Day,
levels = c("recov_2", "recov_1", "resist"))
levels(comp_protegens$Trtmt_Day) <- c("Recovery (Day 2)", "Recovery (Day 1)", "Resistance")
ggplot(comp_protegens,
aes(x = effect_est, y = as.factor(Heat), colour = Trtmt_Day, shape = as.logical(protegens))) +
#facet_grid(. ~ protegens) +
geom_vline(xintercept = 0, colour="darkgrey") +
geom_point(position = position_dodge(width = 0.5)) +
geom_errorbarh(position = position_dodge(width = 0.5),
aes(xmin = effect_loCI, xmax = effect_hiCI), height = 0.1) +
scale_colour_manual(values=trtmt_pal) +
labs(x = "Effect Size on Complementarity Effect",
y = "Heat duration",
shape = "protegens\npresent?")
# in this case we cannot average across the effect of protegens because the effect runs in opposite directions with vs. without protegens!
posthoc_6h <- emmeans(effect_6h_protegens,
pairwise ~ Trtmt_Day:protegens,
data = biodiv_6h)
## NOTE: Results may be misleading due to involvement in interactions
posthoc_12h <- emmeans(effect_12h_protegens,
pairwise ~ Trtmt_Day:protegens,
data = biodiv_12h)
## NOTE: Results may be misleading due to involvement in interactions
posthoc_24h <- emmeans(effect_24h_protegens,
pairwise ~ Trtmt_Day:protegens,
data = biodiv_24h)
## NOTE: Results may be misleading due to involvement in interactions
# a function that extracts the confidence intervals from a post-hoc object
get_posthoc <- function(posthoc_object, heat_trtmt) {
output <- multcomp::cld(posthoc_object, alpha=0.05/3, Letters = letters) %>%
data.frame() %>%
select(-df)
colnames(output)[3:7] <- c("est", "SE", "loCI", "hiCI", "groups")
output$Heat <- heat_trtmt
return(output)
}
# create a data.frame for plotting marginal effect sizes using a forest plot with the group labels
comp_effects <- data.frame()
comp_effects <- rbind(comp_effects,
get_posthoc(posthoc_6h, heat_trtmt = 6),
get_posthoc(posthoc_12h, heat_trtmt = 12),
get_posthoc(posthoc_24h, heat_trtmt = 24))
# re-order the levels of Trtmt_Day to go from resistance to recovery then rename them for nice plotting
comp_effects$Trtmt_Day <- factor(comp_effects$Trtmt_Day,
levels = c("recov_2", "recov_1", "resist"))
levels(comp_effects$Trtmt_Day) <- c("Recovery (Day 2)", "Recovery (Day 1)", "Resistance")
# plot
ggplot(comp_effects,
aes(x = est, y = as.factor(Heat), colour = Trtmt_Day, shape = as.logical(protegens))) +
#facet_grid(. ~ protegens) +
geom_vline(xintercept = 0, colour="darkgrey") +
geom_point(position = position_dodge(width = 0.5)) +
geom_errorbarh(position = position_dodge(width = 0.5),
aes(xmin = loCI, xmax = hiCI), height = 0.1) +
geom_text(position = position_dodge(width = 0.5),
aes(x=-2, label=groups)) +
scale_colour_manual(values=trtmt_pal) +
labs(x = "Effect Size on Complementarity Effect",
y = "Heat duration",
shape = "protegens\npresent?")
# the sample sizes are fixed at the number of non-monoculture communities (5 without protegens, 6 with protegens; I use the conservative n = 5 for both to keep the code simple)
comp_effects$n <- 5
# estimate the SD from the SE
comp_effects <- comp_effects %>% mutate(SD = SE * sqrt(n)) %>%
# re-order by Heat and Trtmt_Day
arrange(Heat, protegens, Trtmt_Day)
# all pairwise combinations of comparisons between the same treatment day for different durations
temp <- t(combn(c(1,7,13), 2))
combos <- rbind(temp, temp+1, temp+2, temp+3, temp+4, temp+5)
rm(temp)
# loop through all the combinations and do the t-tests
comp_effects_ttests <- data.frame()
for(i in 1:nrow(combos)){
comp_effects_ttests <- rbind(comp_effects_ttests,
run_ttest(row_x = combos[i,1],
row_y = combos[i,2],
summary_stats_df = comp_effects))
}
comp_effects_ttests$adjusted_p <- p.adjust(comp_effects_ttests$pvalue, method = "bonferroni")
comp_effects_ttests$Trtmt_Day <- comp_effects$Trtmt_Day[combos[,1]]
comp_effects_ttests$protegens <- comp_effects$protegens[combos[,1]]
comp_effects_ttests$Heat_1 <- comp_effects$Heat[combos[,1]]
comp_effects_ttests$Heat_2 <- comp_effects$Heat[combos[,2]]
print(comp_effects_ttests)
## t_statistic df pvalue adjusted_p Trtmt_Day protegens
## t 6.33317947 7.986335 0.0002262566 0.004072619 Recovery (Day 2) 0
## t1 1.99368966 7.998853 0.0813122316 1.000000000 Recovery (Day 2) 0
## t2 -4.34419946 7.993089 0.0024694648 0.044450366 Recovery (Day 2) 0
## t3 -1.72092013 7.995322 0.1235950734 1.000000000 Recovery (Day 1) 0
## t4 -6.47394668 7.999787 0.0001933705 0.003480669 Recovery (Day 1) 0
## t5 -4.82672625 7.997103 0.0013114255 0.023605660 Recovery (Day 1) 0
## t6 1.66035680 7.972021 0.1355551483 1.000000000 Resistance 0
## t7 0.39012867 6.758358 0.7084442781 1.000000000 Resistance 0
## t8 -0.88800348 6.995170 0.4040371786 1.000000000 Resistance 0
## t9 4.29687456 7.999998 0.0026268318 0.047282972 Recovery (Day 2) 1
## t10 1.19966963 7.999998 0.2645884542 1.000000000 Recovery (Day 2) 1
## t11 -3.09784557 8.000000 0.0147136790 0.264846222 Recovery (Day 2) 1
## t12 2.41233771 7.999829 0.0423540573 0.762373032 Recovery (Day 1) 1
## t13 3.40118522 7.999801 0.0093444261 0.168199669 Recovery (Day 1) 1
## t14 0.99070610 7.999999 0.3508439079 1.000000000 Recovery (Day 1) 1
## t15 -0.91891901 7.999918 0.3850067680 1.000000000 Resistance 1
## t16 -0.72036249 7.523464 0.4930714426 1.000000000 Resistance 1
## t17 0.07574208 7.534118 0.9415966481 1.000000000 Resistance 1
## Heat_1 Heat_2
## t 6 12
## t1 6 24
## t2 12 24
## t3 6 12
## t4 6 24
## t5 12 24
## t6 6 12
## t7 6 24
## t8 12 24
## t9 6 12
## t10 6 24
## t11 12 24
## t12 6 12
## t13 6 24
## t14 12 24
## t15 6 12
## t16 6 24
## t17 12 24
Question: what happens to selection and complementarity when monocultures are 0? (esp. when co-cultures are non-zero?)
This is what AI told me: if the monoculture yield for a treatment is zero, the selection effect calculation can produce undefined or misleading results, since it compares monoculture yields to polyculture yields and dividing by zero gives infinite or undefined ratios.
Zero monoculture values likewise complicate the complementarity effect: because its calculation involves ratios or differences relative to monoculture yields, it can become undefined or uninformative even when the mixture itself yields positively.
…One possibility it suggested was adding a very small constant…
Another possibility it suggested was to use a different index. When I asked it for some examples it suggested the following (which may or may not be real):
Loreau-Hector Decomposition: The original diversity effect decomposition can be adapted. Loreau and Hector proposed partitioning the net diversity effect into selection and complementarity; different statistical techniques (like rarefaction, it claimed) could help mitigate the influence of zeros in the calculations.
Hill Numbers (effective number of species): Hill numbers quantify the diversity of an assemblage, incorporating species abundances while being robust to zeros. They offer various orders q, capturing different facets of diversity, which can reflect complementarity effects without being overly sensitive to rare or absent species.
Jaccard Index (and modified versions): The Jaccard index computes the ratio of shared to total species. Modifications can allow handling of sites with zero counts while still providing insight into complementarity and selection. (Similar to the Jaccard Index, the Sørensen-Dice Coefficient focuses on shared elements and can be particularly useful with sparse data.)
Cohen’s Kappa: While traditionally used to evaluate agreement between two raters or classification systems, it can be adapted to assess selection effects in categorical data, providing insight into complementarity or overlap without being overly sensitive to zeros.
Bray-Curtis Dissimilarity: This index measures the dissimilarity between two samples based on counts of species or occurrences. The AI claimed it is not sensitive to zeros because it works on proportional differences rather than raw counts — though, as it turns out further down, it is actually undefined when both samples being compared are all zeros.
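To make the zero-monoculture problem concrete, here is a minimal base-R sketch of the Loreau-Hector additive partition (complementarity = N * mean(dRY) * mean(M), selection = N * cov(dRY, M)). `lh_partition` is my own toy helper, not from any package, and every number is invented for illustration:

```r
# toy sketch of the Loreau-Hector additive partition; not from any package
lh_partition <- function(Y_poly, M, RY_expected) {
  # Y_poly: observed species yields in mixture; M: monoculture yields;
  # RY_expected: expected relative yields (e.g. 1/N for equal inoculation)
  N    <- length(M)
  dRY  <- Y_poly / M - RY_expected                # Inf/NaN when any M == 0
  covp <- mean((dRY - mean(dRY)) * (M - mean(M))) # population covariance
  c(complementarity = N * mean(dRY) * mean(M),
    selection       = N * covp)
}

M  <- c(10, 8, 0, 5)   # invented yields; the third monoculture went extinct
Yp <- c(4, 3, 2, 2)
lh_partition(Yp, M, rep(1/4, 4))          # breaks: Inf/NaN from the zero
lh_partition(Yp, M + 1e-3, rep(1/4, 4))   # the "small constant" workaround
```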
ggplot(absDensity %>% filter(Heat == 0),
aes(x=Day, y=Diversity, colour=as.factor(CommRich))) +
facet_grid(~protegens) +
geom_beeswarm(alpha=0.5) +
scale_colour_viridis_d(option = "viridis") +
labs(colour="Inoculated\nCommunity\nRichness")
## Warning: Removed 8 rows containing missing values or values outside the scale range
## (`geom_point()`).
ggplot(test.df,
aes(y=HillDiv_q1, x=Day, colour=as.factor(CommRich))) +
facet_grid(protegens~as.factor(Heat)) +
geom_point(alpha=0.5) +
scale_colour_viridis_d(option = "viridis")
## Warning: Removed 56 rows containing missing values or values outside the scale range
## (`geom_point()`).
ggplot(absDensity %>% filter(Heat == 0, CommRich>1),
aes(x=Day, y=Diversity, colour=as.factor(CommRich))) +
facet_grid(~protegens) +
geom_beeswarm(alpha=0.5) +
scale_colour_viridis_d(option = "viridis", begin=0.1) +
scale_x_continuous(breaks=0:5) +
labs(colour="Inoculated\nCommunity\nRichness",
y="Shannon Diversity")
rm(absDen_48h, biodiv_6h, biodiv_12h, biodiv_24h, biodiversityEffects_df, combos,
comp_6h_0, comp_6h_1, comp_6h_2, comp_12h_0, comp_12h_1, comp_12h_2, comp_24h_0, comp_24h_1, comp_24h_2, comp_control0, comp_control1, comp_control2, comp_control3,
comp_effects, comp_effects_ttests, comp_protegens, decoupling_productivity, early_decoupling, late_decoupling, NoExtinct_early_decoupling, NoExtinct_late_decoupling,
divers6h_H3, divers12h_H3, divers24h_H3, divers48h_H3, effect_6h_protegens, effect_12h_protegens, effect_24h_protegens, effect_48h,
posthoc_6h, posthoc_12h, posthoc_24h, posthoc_48h, productivity_effects, productivity48h_H0, productivitySubsettedData, richness_forFit,
sel_6h_0, sel_6h_1, sel_6h_2, sel_12h_0, sel_12h_1, sel_12h_2, sel_24h_0, sel_24h_1, sel_24h_2, sel_effects, sel_effects_ttests, sel_protegens, selection_control0, selection_control1, selection_control2, selection_control3, test.df)
# (dropped community_extinction_probs from this list: that object was never created, so rm() warned "object not found")
After backing up a bit and thinking about what the main story of the paper could be, I think the main message I would like to tell is that heat duration has a threshold effect. Shorter and intermediate heat durations have some effect during heat that differs from the control, but the communities return to a similar state after recovery. Long-duration heat events, on the other hand, lead to extinction (i.e., either of the entire community or of vulnerable species within it), so those communities can no longer recover to their original state. In other words, there is a threshold effect: once heat-induced killing of bacteria has gone on for so long that it passes a critical point, the communities recover to a different state. I don’t want to use the term “tipping point”, but the design of our experiment certainly allows us to use phrases like “threshold effect” and “critical transition” sensu stricto (e.g., as explained in Munson et al., 2018).
I think it would be fantastic if I could produce a figure that summarizes the entire dataset in a way that builds an argument for the quintessential ball-and-landscape schematic that people keep showing when they talk about ecosystem stability under perturbation (e.g., see the schematic in Fig. 2 of Shade et al., 2012 or the empirical figure in Fig. 3 of Jurburg et al., 2017). (Initially I thought I would have a mixture of quantitative and qualitative data, so I planned to use principal coordinate analysis (PCoA) and also try non-metric multidimensional scaling (NMDS) to make sure both approaches give consistent results. Then I talked to Nico and he pointed out that my data are numeric – not counts! – so I should use PCA. In any case, in practice for community ecology, PCA uses Euclidean distances and PCoA uses a dissimilarity index; I can literally try out all 3 and see what happens.)
Here are some tutorials on ordination: https://eddatascienceees.github.io/tutorial-rayrr13/ https://ourcodingclub.github.io/tutorials/ordination/ https://uw.pressbooks.pub/appliedmultivariatestatistics/chapter/anosim/ https://uw.pressbooks.pub/appliedmultivariatestatistics/chapter/visualizing-and-interpreting-ordinations/
If we just follow the example tutorials directly, with columns = four species and rows = different communities on different days for different heat treatments, then the data simply gets split up by species.
To calculate the Bray-Curtis dissimilarity, we are forced to choose how to deal with NA values (most of which are found in the resistance time points, so it doesn’t really make sense to outright drop them). NA values exist for two reasons:
1. “true” missing data, where the well was not acquired at all due to technical difficulties/mistakes (only a few of this type). I used interpolation to deal with these: use the median value from the other replicates of that same community:day:treatment.
2. below-threshold-of-detection missing data, where the total density was too low to reliably estimate the cell counts. I replace the NA with the limit of detection of the cytometer (“epsilon”, as above) and assume equal frequencies of the species that were inoculated in that community.
# go back to the complete data that includes NA values for all 4 species on some days
absDen_forOrd <- absDen_forFit %>% select(-Total_density, -Diversity, -HillDiv_q1, -TotDensity_scale)
# NA values with Total_density == NA are "true" missing data where I failed to record the flow cytometry measurements on that day due to technical difficulties/mistakes. These can be interpolated by using the median values from the remaining community replicates
## get the median values for all communities, days, and heat treatments
median_vals <- absDen_forOrd %>% group_by(Heat, Day, community) %>%
summarise(Med_putida = median(Conc_putida, na.rm=TRUE),
Med_protegens = median(Conc_protegens, na.rm=TRUE),
Med_grimontii = median(Conc_grimontii, na.rm=TRUE),
Med_veronii = median(Conc_veronii, na.rm=TRUE))
## `summarise()` has grouped output by 'Heat', 'Day'. You can override using the
## `.groups` argument.
## get the index for the rows with "true" missing values
missing_rows <- which(is.na(absDen_forOrd$TotDen_plusEpsilon))
## loop through the missing values
for(i in missing_rows){
# find the interpolation value in the table of median values
temp_med_val <- median_vals[median_vals$Heat == absDen_forOrd$Heat[i] &
median_vals$Day == absDen_forOrd$Day[i] &
median_vals$community == absDen_forOrd$community[i],]
# replace the NA values with the median values
absDen_forOrd$Conc_putida[i] <- temp_med_val$Med_putida
absDen_forOrd$Conc_protegens[i] <- temp_med_val$Med_protegens
absDen_forOrd$Conc_grimontii[i] <- temp_med_val$Med_grimontii
absDen_forOrd$Conc_veronii[i] <- temp_med_val$Med_veronii
# clean up
rm(temp_med_val)
}
# clean up
rm(median_vals, missing_rows)
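As an aside, the row-by-row loop above could also be written as a join. A hypothetical vectorized sketch (assuming the same columns — Heat, Day, community, Conc_*, TotDen_plusEpsilon — as in absDen_forOrd; `interpolate_true_missing` is my own name, not used elsewhere in this document):

```r
# hypothetical vectorized replacement for the median-interpolation loop above
library(dplyr)

interpolate_true_missing <- function(df) {
  medians <- df %>%
    group_by(Heat, Day, community) %>%
    summarise(across(starts_with("Conc_"), ~ median(.x, na.rm = TRUE),
                     .names = "Med_{.col}"),
              .groups = "drop")
  df %>%
    left_join(medians, by = c("Heat", "Day", "community")) %>%
    # only the "true" missing rows (TotDen_plusEpsilon is NA) get the medians;
    # below-detection NAs are left for the epsilon treatment that follows
    mutate(Conc_putida    = if_else(is.na(TotDen_plusEpsilon), Med_Conc_putida,    Conc_putida),
           Conc_protegens = if_else(is.na(TotDen_plusEpsilon), Med_Conc_protegens, Conc_protegens),
           Conc_grimontii = if_else(is.na(TotDen_plusEpsilon), Med_Conc_grimontii, Conc_grimontii),
           Conc_veronii   = if_else(is.na(TotDen_plusEpsilon), Med_Conc_veronii,   Conc_veronii)) %>%
    select(-starts_with("Med_Conc_"))
}
```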
# on the other hand, NA values where Total_density is epsilon represent flow cytometry counts that were below the threshold of detection. In this case let's assume 1:1 ratios of inoculated strains at a total density equal to epsilon.
epsilon <- min(absDen_forOrd$TotDen_plusEpsilon, na.rm=TRUE)
## get the index for the missing value rows below the threshold of detection
missing_rows <- which(is.na(absDen_forOrd$Conc_putida))
## CommRich had NA values that were supposed to flag something else; that no longer matters, so recompute it as the inoculated richness
absDen_forOrd$CommRich <- absDen_forOrd$putida + absDen_forOrd$protegens + absDen_forOrd$grimontii + absDen_forOrd$veronii
for(i in missing_rows){
# replace the NA values with epsilon divided by the inoculated species richness
absDen_forOrd$Conc_putida[i] <- absDen_forOrd$putida[i] * epsilon / absDen_forOrd$CommRich[i]
absDen_forOrd$Conc_protegens[i] <- absDen_forOrd$protegens[i] * epsilon / absDen_forOrd$CommRich[i]
absDen_forOrd$Conc_grimontii[i] <- absDen_forOrd$grimontii[i] * epsilon / absDen_forOrd$CommRich[i]
absDen_forOrd$Conc_veronii[i] <- absDen_forOrd$veronii[i] * epsilon / absDen_forOrd$CommRich[i]
}
# re-order the levels of Heat for better plotting
absDen_forOrd$Heat <- factor(absDen_forOrd$Heat, levels=c("control", "6", "12", "24", "48"))
# finally we can drop the TotDen_plusEpsilon column
absDen_forOrd <- absDen_forOrd %>% select(-TotDen_plusEpsilon)
rm(epsilon, missing_rows)
# now we can do the ordination
# keep just the 4 species abundances (select by name to avoid fragile column indices)
abundance_matrix <- as.matrix(absDen_forOrd[, c("Conc_putida", "Conc_protegens", "Conc_grimontii", "Conc_veronii")])
# re-name the columns for better plotting
colnames(abundance_matrix) <- c("Pu", "Pt", "Gi", "Vn")
try.PCA <- rda(abundance_matrix, scale=FALSE) # function from vegan package
## all species are measured on the same scale (flow cytometry counts) so use scale = FALSE
# a bar plot of relative eigenvalues --> the percentage variance explained by each axis
barplot(as.vector(try.PCA$CA$eig)/sum(try.PCA$CA$eig), xlab="PCA ordination axis", ylab="% of variance explained")
## 83% of variance explained by top 2 axes
sum((as.vector(try.PCA$CA$eig)/sum(try.PCA$CA$eig))[1:2])
## [1] 0.8310123
## and 97% explained by the top 3. As we only have 4 species, we should hope this would be the case XD
sum((as.vector(try.PCA$CA$eig)/sum(try.PCA$CA$eig))[1:3])
## [1] 0.9732756
# show results as a biplot
biplot(try.PCA, choices = c(1,2), type = c("text", "points"))
biplot(try.PCA, choices = c(1,3), type = c("text", "points"))
The PCA is telling us that most of the variance in the data is explained by the abundances of putida vs protegens (axis 1), with residual variance explained by veronii abundance (axis 2), and the remaining variance explained by grimontii (axis 3). This result is rather trivial. But the problem is not just the PCA method itself because NMDS is also giving us a similar result:
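To check that reading directly, the species loadings can be pulled out with vegan’s scores() accessor (axis choice here assumes the 4-species matrix above):

```r
# loadings of the 4 species on the first three principal components;
# the largest absolute value on each axis shows which species drives it
scores(try.PCA, display = "species", choices = 1:3)
```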
# it's not possible to use bray-curtis dissimilarity as there are communities with 0 for all species:
try.NMDS <- metaMDS(abundance_matrix, distance = "bray", k = 2, autotransform = TRUE, trymax=100)
## if we look at the source code for vegdist (the function that generated the error message),
vegdist
## we see that 5 other metrics do support communities with all 0 values: manhattan, euclidean, gower, mahalanobis, and chisq
# let's try again arbitrarily choosing manhattan
try.NMDS <- metaMDS(abundance_matrix, distance = "manhattan", k = 2, autotransform = TRUE, trymax=100)
# this takes a while to run so we will simply save its result
save(try.NMDS, file="naive_nmds.RData")
load("naive_nmds.RData")
# check the stress value. It should be < 0.2, ideally even < 0.05. (Too low stress values can indicate too many 0 values)
try.NMDS$stress # looks good
## [1] 0.06423967
# plot the results
plot(try.NMDS)
ordiplot(try.NMDS, type = "n") # create blank ordination plot
orditorp(try.NMDS, display = "sites", cex = 0.5, air = 0.1) # add row numbers in black
orditorp(try.NMDS, display = "species", col="red", air = 0.1) # add species names in red
# let's look more closely at this plot because we know there is a strong effect of protegens and a weaker effect of heat...
# define a function (adapted from vegan) that returns coordinates for drawing a covariance
# ellipse: it traces a unit circle and maps it through the Cholesky factor of a group's
# covariance matrix, centred on the group centroid
veganCovEllipse <- function (cov, center = c(0, 0), scale = 1, npoints = 100) {
theta <- (0:npoints) * 2 * pi/npoints
Circle <- cbind(cos(theta), sin(theta))
t(center + scale * t(Circle %*% chol(cov)))
}
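A quick toy check that the helper behaves as expected (invented points, base R only):

```r
# demonstrate veganCovEllipse on 50 correlated toy points
set.seed(1)
x   <- rnorm(50)
xy  <- cbind(x, 0.6 * x + rnorm(50, sd = 0.5))
ell <- veganCovEllipse(cov(xy), center = colMeans(xy))
plot(xy, pch = 16, col = "grey50", xlab = "dim 1", ylab = "dim 2")
lines(ell, col = "red", lwd = 2)   # ellipse tilted along the correlation
```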
# Let's switch over to ggplot for displaying Heat by colour and protegens presence/absence by linetype
nmds_for_ggplot <- cbind(absDen_forOrd[,1:11],
as.data.frame(scores(try.NMDS)$sites))
# create a new factor that defines the combination of heat and protegens
nmds_for_ggplot <- nmds_for_ggplot %>% unite("HeatxProtegensxDay", c(Heat, protegens, Day), remove = FALSE)
nmds_for_ggplot$HeatxProtegensxDay <- factor(nmds_for_ggplot$HeatxProtegensxDay)
# create empty dataframe to combine NMDS data with ellipse data
#ellipse12_df <- data.frame()
# adding data for ellipses, using HeatxProtegensxDay as the grouping factor
#for(g in levels(nmds_for_ggplot$HeatxProtegensxDay)){
# ellipse12_df <- rbind(ellipse12_df, cbind(as.data.frame(with(nmds_for_ggplot[nmds_for_ggplot$HeatxProtegensxDay==g,],
# veganCovEllipse(cov.wt(cbind(NMDS1, NMDS2),
# wt=rep(1/length(NMDS1),length(NMDS1)))$cov,
# center=c(mean(NMDS1),mean(NMDS2)))))
# , HeatxProtegensxDay=g))
#}
# This is NOT working because of some issue when drawing the covariance ellipse. It must be the same issue as the error that gg_ordiplot gives below: "Warning: the matrix is either rank-deficient or not positive definite"
gg_ordiplot(try.NMDS, groups = nmds_for_ggplot$HeatxProtegensxDay, plot = TRUE)
## Warning in chol.default(cov, pivot = TRUE): the matrix is either rank-deficient
## or not positive definite
## Warning in chol.default(cov, pivot = TRUE): the matrix is either rank-deficient
## or not positive definite
## Warning in chol.default(cov, pivot = TRUE): the matrix is either rank-deficient
## or not positive definite
## Warning in chol.default(cov, pivot = TRUE): the matrix is either rank-deficient
## or not positive definite
test <- absDen_forOrd
test$Heat <- as.character(levels(test$Heat))[test$Heat]
test$Heat[test$Heat == "control"] <- 0
test$Heat <- as.numeric(test$Heat)
# so I can draw the ellipses with the functions from ggordiplots, but this package is not flexible enough to clearly plot the protegens x Heat interaction... (never mind the 3-way interaction with time as well)
gg_ordiplot(try.NMDS, groups = paste(absDen_forOrd$Day, absDen_forOrd$protegens), plot = TRUE)
gg_envfit(try.NMDS, env = test$Heat, groups = paste(absDen_forOrd$Heat, absDen_forOrd$Day, absDen_forOrd$protegens), plot = TRUE)
## [1] "No variable significant at alpha <= 0.05"
# clean up
rm(abundance_matrix, try.PCA, try.NMDS, nmds_for_ggplot, test)
Above I am using the data in a rather naive way: the columns of the abundance matrix are the 4 species, and the rows (“sites”) are the different communities over time and across heat durations. It would be ideal to do the ordination this way, because then we could see whether the communities exposed to 6-24h of heat recover in the same way (while 48h recovers in a different way). The problem is that this creates rows in the abundance matrix that are all 0’s (i.e., when extinction happens). The Bray-Curtis dissimilarity is undefined for such extinct sites and, even though some other metrics do handle them (e.g., manhattan is used above), I still have problems: the covariance ellipses cannot be drawn because the abundance matrix is “either rank-deficient or not positive definite” :/ Maybe I will try out the different metrics below considering species other than protegens…
We want to understand how the communities are changing over time so let’s give it the data as species x time. This can be achieved by widening the data so that we have abundances of the 4 species during resistance, during early recovery, and during late recovery.
Note that I also had to keep just 3 time points from the control treatment. I chose to keep day 1 (coded as “resistance”), day 3 (coded as “early recovery”), and day 5 (coded as “late recovery”) because this way the ordination plot will show the control treatment early, middle, and late in the time series…
While all of this is a slight improvement compared to above, it doesn’t show exactly the summary picture that I was hoping for :(
# first we have to widen the data:
# create a column that indicates the treatment day as resistance, early recovery, or late recovery
absDen_forOrd$trtmt_day <- "resist"
absDen_forOrd$trtmt_day[absDen_forOrd$Recov_Day == 1] <- "early_recov"
absDen_forOrd$trtmt_day[absDen_forOrd$Recov_Day == 2] <- "late_recov"
# ENTIRELY ARBITRARILY: I will keep days 1, 3, and 5 for control
absDen_forOrd$trtmt_day[absDen_forOrd$Heat == "control" & absDen_forOrd$Day == 3] <- "early_recov"
absDen_forOrd$trtmt_day[absDen_forOrd$Heat == "control" & absDen_forOrd$Day == 5] <- "late_recov"
# remove day 1 for 12h, 24h, 48h AND day 2 for 48h.
absDen_forOrd <- absDen_forOrd[!(absDen_forOrd$Heat == 12 & absDen_forOrd$Day == 1), ]
absDen_forOrd <- absDen_forOrd[!(absDen_forOrd$Heat == 24 & absDen_forOrd$Day == 1), ]
absDen_forOrd <- absDen_forOrd[!(absDen_forOrd$Heat == 48 & absDen_forOrd$Day == 1), ]
absDen_forOrd <- absDen_forOrd[!(absDen_forOrd$Heat == 48 & absDen_forOrd$Day == 2), ]
# also remove day 2 and day 4 for control.
absDen_forOrd <- absDen_forOrd[!(absDen_forOrd$Heat == "control" & absDen_forOrd$Day == 2), ]
absDen_forOrd <- absDen_forOrd[!(absDen_forOrd$Heat == "control" & absDen_forOrd$Day == 4), ]
# pivot wider to create a column for each of the 4 species on each of the 3 days
absDen_wide_forOrd <- absDen_forOrd %>% select(-Day, -Heat_Day, -Recov_Day) %>%
pivot_wider(names_from = trtmt_day,
values_from = c(Conc_putida, Conc_protegens, Conc_grimontii, Conc_veronii))
# re-name the species abundance over time columns so they are shorter (again for better plotting)
colnames(absDen_wide_forOrd)[9:20] <- c("Pu_Resist", "Pu_earlyR", "Pu_lateR",
"Pt_Resist", "Pt_earlyR", "Pt_lateR",
"Gi_Resist", "Gi_earlyR", "Gi_lateR",
"Vn_Resist", "Vn_earlyR", "Vn_lateR")
###############
# OKKKKAYYY! now we can try out the PCA:
###############
# keep just the species abundances
abundance_matrix <- as.matrix(absDen_wide_forOrd[,9:20])
try.PCA <- rda(abundance_matrix, scale=FALSE)
# a bar plot of relative eigenvalues --> the percentage variance explained by each axis
barplot(as.vector(try.PCA$CA$eig)/sum(try.PCA$CA$eig), xlab="PCA ordination axis", ylab="% of variance explained")
## 63% of variance explained by top 2 axes
sum((as.vector(try.PCA$CA$eig)/sum(try.PCA$CA$eig))[1:2])
## [1] 0.5953685
## and 76% explained by the top 3...
sum((as.vector(try.PCA$CA$eig)/sum(try.PCA$CA$eig))[1:3])
## [1] 0.7537365
# show results as a biplot
biplot(try.PCA, choices = c(1,2), type = c("text", "points"))
biplot(try.PCA, choices = c(1,3), type = c("text", "points"))
# okay, so again it seems to be mostly separating things by species...
# this is probably why Nico warned me that I will likely need to separate the communities out by species anyway
# visualize using ellipse plot
ordiplot(try.PCA) # plot shows communities (circles) and species:time (crosses)
ordiellipse(try.PCA, absDen_wide_forOrd$Heat, label = FALSE,
border =c("bisque1", "orange", "brown1", "darkorchid1", "deepskyblue2"), lwd=3,
draw = "polygon", alpha=120) # adding ellipses to the plot, grouping by distance (inverts$Distance)
legend("topright", title="Heat",
levels(absDen_wide_forOrd$Heat),
fill=c("bisque1", "orange", "brown1", "darkorchid1", "deepskyblue2"), horiz=FALSE, cex=.9) # adding a legend
# other axes
ordiplot(try.PCA, choices = c(1,3)) # plot shows communities (circles) and species:time (crosses)
ordiellipse(try.PCA, choices = c(1,3), absDen_wide_forOrd$Heat, label = FALSE,
border =c("bisque1", "orange", "brown1", "darkorchid1", "deepskyblue2"), lwd=3,
draw = "polygon", alpha=120) # adding ellipses to the plot, grouping by distance (inverts$Distance)
legend("topright", title="Heat",
levels(absDen_wide_forOrd$Heat),
fill=c("bisque1", "orange", "brown1", "darkorchid1", "deepskyblue2"), horiz=FALSE, cex=.9) # adding a legend
###############
# FINE THEN! we'll try a PCoA:
###############
# let's use the Bray-Curtis dissimilarity, as is common for abundance data
# (named bray_dist so we don't mask base R's dist() function)
bray_dist <- vegdist(abundance_matrix, method = "bray", na.rm=TRUE)
# run PCoA from ape package
try.PCoA <- pcoa(bray_dist)
# a bar plot of relative eigenvalues --> the percentage variance explained by each axis
barplot(try.PCoA$values$Relative_eig[1:10], xlab="PCoA ordination axis", ylab="% of variance explained")
# the first 2 axes only explain 59% of the variance
sum(try.PCoA$values$Relative_eig[1:2])
## [1] 0.5878416
biplot.pcoa(try.PCoA, abundance_matrix)
# this only looks slightly better and I'm not sure how to plot things from ape...
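One hedged workaround for the plotting limitation: extract the site coordinates from the pcoa object into a data frame and draw them with ggplot, reusing the metadata columns from absDen_wide_forOrd (object and column names as used above):

```r
# pull the first two PCoA axes and attach grouping metadata for ggplot
pcoa_df <- data.frame(Axis1     = try.PCoA$vectors[, 1],
                      Axis2     = try.PCoA$vectors[, 2],
                      Heat      = absDen_wide_forOrd$Heat,
                      protegens = as.factor(absDen_wide_forOrd$protegens))
ggplot(pcoa_df, aes(x = Axis1, y = Axis2, colour = Heat, shape = protegens)) +
  geom_point(alpha = 0.7) +
  stat_ellipse(aes(linetype = protegens)) +
  scale_colour_viridis_d(option = "viridis")
```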
###############
# ALRIGHT!!! we'll do NMDS again
###############
# a function to automatically run the NMDS for k = 1 to 10 so we can choose appropriately small number of dimensions for ordination
NMDS.scree <- function(mat) { # mat is the abundance matrix
data.frame(k = 1:10,
# autotransform the data before calculating the bray-curtis dissimilarity
stress = sapply(1:10, function(x) metaMDS(mat, distance = "bray", k = x, autotransform = TRUE)$stress))
}
scree_out <- NMDS.scree(abundance_matrix)
## (verbose metaMDS run logs for k = 1 to 10 condensed; each run applied a
## square root transformation and Wisconsin double standardization, then up to
## 20 random restarts. Best stress found at each k:)
## k=1: 0.212; k=2: 0.099; k=3: 0.057; k=4: 0.031; k=5: 0.023;
## k=6: 0.017; k=7: 0.014; k=8: 0.012; k=9: 0.010; k=10: 0.009 (log truncated)
## ... Procrustes: rmse 0.01172843 max resid 0.05889063
## *** Best solution was not repeated -- monoMDS stopping criteria:
## 20: no. of iterations >= maxit
plot(scree_out)
# k=3 looks great
try.NMDS <- metaMDS(abundance_matrix, distance = "bray", k = 3, autotransform = TRUE, trymax=100)
## Square root transformation
## Wisconsin double standardization
## ... (20 random starts condensed: best solution found at Run 11,
## stress 0.05653649; "*** Best solution repeated 1 times")
# plot the results for axis 1 & 2
ordiplot(try.NMDS, type = "n") # create blank ordination plot
orditorp(try.NMDS, display = "sites", cex = 0.5, air = 0.1) # add row numbers in black
orditorp(try.NMDS, display = "species", col="red", air = 0.1) # add species names in red
# plot the results for axis 1 & 3
ordiplot(try.NMDS, choices = c(1,3), type = "n") # create blank ordination plot
orditorp(try.NMDS, choices = c(1,3), display = "sites", cex = 0.5, air = 0.1) # add row numbers in black
orditorp(try.NMDS, choices = c(1,3), display = "species", col="red", air = 0.1) # add species names in red
# plot the results for axis 2 & 3
ordiplot(try.NMDS, choices = c(2,3), type = "n") # create blank ordination plot
orditorp(try.NMDS, choices = c(2,3), display = "sites", cex = 0.5, air = 0.1) # add row numbers in black
orditorp(try.NMDS, choices = c(2,3), display = "species", col="red", air = 0.1) # add species names in red
# visualize using ellipse plot for axis 1 & 2
ordiplot(try.NMDS)
ordiellipse(try.NMDS, absDen_wide_forOrd$Heat, label = FALSE,
border =c("bisque1", "orange", "brown1", "darkorchid1", "deepskyblue2"), lwd=3,
draw = "polygon", alpha=120) # adding ellipses to the plot, grouped by Heat treatment
legend("topright", title="Heat",
levels(absDen_wide_forOrd$Heat),
fill=c("bisque1", "orange", "brown1", "darkorchid1", "deepskyblue2"), horiz=FALSE, cex=.9)
# ellipse plot for axis 1 & 3
ordiplot(try.NMDS, choices = c(1,3)) # plot shows communities (circles) and species:time (crosses)
ordiellipse(try.NMDS, choices = c(1,3), absDen_wide_forOrd$Heat, label = FALSE,
border =c("bisque1", "orange", "brown1", "darkorchid1", "deepskyblue2"), lwd=3,
draw = "polygon", alpha=120)
legend("topright", title="Heat",
levels(absDen_wide_forOrd$Heat),
fill=c("bisque1", "orange", "brown1", "darkorchid1", "deepskyblue2"), horiz=FALSE, cex=.9)
# ellipse plot for axis 2 & 3
ordiplot(try.NMDS, choices = c(2,3)) # plot shows communities (circles) and species:time (crosses)
ordiellipse(try.NMDS, choices = c(2,3), absDen_wide_forOrd$Heat, label = FALSE,
border =c("bisque1", "orange", "brown1", "darkorchid1", "deepskyblue2"), lwd=3, lty=1,
draw = "polygon", alpha=120)
legend("topright", title="Heat",
levels(absDen_wide_forOrd$Heat),
fill=c("bisque1", "orange", "brown1", "darkorchid1", "deepskyblue2"), horiz=FALSE, cex=.9)
# ...there is a small trend with heat but nothing super dramatic...
# that kind of makes sense as we are interested in heat:time...
# well, we already know that presence/absence of protegens is consistently the most important thing for all communities, so let's see if that shows up here.
# Let's switch over to ggplot to be certain that everything is labelled correctly.
nmds_for_ggplot <- cbind(absDen_wide_forOrd[,1:8],
as.data.frame(scores(try.NMDS)$sites))
# create a new factor that defines the combination of heat and protegens
nmds_for_ggplot <- nmds_for_ggplot %>% unite("HeatxProtegens", c(Heat, protegens), remove = FALSE)
nmds_for_ggplot$HeatxProtegens <- factor(nmds_for_ggplot$HeatxProtegens,
levels = c("6_0", "6_1", "12_0", "12_1", "24_0", "24_1", "48_0", "48_1", "control_0", "control_1"))
# create empty dataframes to combine NMDS data with ellipse data
ellipse12_df <- ellipse13_df <- ellipse23_df <- data.frame() # numbers indicate the ordination axes
# adding data for ellipses, using HeatxProtegens as a grouping factor
for(g in levels(nmds_for_ggplot$HeatxProtegens)){
ellipse12_df <- rbind(ellipse12_df, cbind(as.data.frame(with(nmds_for_ggplot[nmds_for_ggplot$HeatxProtegens==g,],
veganCovEllipse(cov.wt(cbind(NMDS1, NMDS2),
wt=rep(1/length(NMDS1),length(NMDS1)))$cov,
center=c(mean(NMDS1),mean(NMDS2)))))
, HeatxProtegens=g))
ellipse13_df <- rbind(ellipse13_df, cbind(as.data.frame(with(nmds_for_ggplot[nmds_for_ggplot$HeatxProtegens==g,],
veganCovEllipse(cov.wt(cbind(NMDS1, NMDS3),
wt=rep(1/length(NMDS1),length(NMDS1)))$cov,
center=c(mean(NMDS1),mean(NMDS3)))))
, HeatxProtegens=g))
ellipse23_df <- rbind(ellipse23_df, cbind(as.data.frame(with(nmds_for_ggplot[nmds_for_ggplot$HeatxProtegens==g,],
veganCovEllipse(cov.wt(cbind(NMDS2, NMDS3),
wt=rep(1/length(NMDS2),length(NMDS2)))$cov,
center=c(mean(NMDS2),mean(NMDS3)))))
, HeatxProtegens=g))
}
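For reference, `veganCovEllipse()` is not exported by vegan, so the loop above relies on a hand-defined helper (its actual definition lives in an earlier chunk). A common version, assumed here, maps points on the unit circle through the Cholesky factor of the covariance matrix and shifts them to the group centroid:

```r
# Hedged sketch of veganCovEllipse (assumed version; real definition is in an
# earlier chunk): map npoints on the unit circle through chol(cov), then
# translate to the requested center.
veganCovEllipse <- function(cov, center = c(0, 0), scale = 1, npoints = 100) {
  theta <- (0:npoints) * 2 * pi / npoints
  unit_circle <- cbind(cos(theta), sin(theta))  # (npoints+1) x 2 circle coordinates
  t(center + scale * t(unit_circle %*% chol(cov)))
}
```

With an identity covariance and default center, this returns points on the unit circle, which is a quick way to sanity-check the helper.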
# now we separate the HeatxProtegens columns:
ellipse12_df <- ellipse12_df %>% separate(HeatxProtegens, c("Heat", "protegens"))
ellipse12_df$Heat <- factor(ellipse12_df$Heat, levels = levels(nmds_for_ggplot$Heat))
ellipse13_df <- ellipse13_df %>% separate(HeatxProtegens, c("Heat", "protegens"))
ellipse13_df$Heat <- factor(ellipse13_df$Heat, levels = levels(nmds_for_ggplot$Heat))
ellipse23_df <- ellipse23_df %>% separate(HeatxProtegens, c("Heat", "protegens"))
ellipse23_df$Heat <- factor(ellipse23_df$Heat, levels = levels(nmds_for_ggplot$Heat))
nmds_for_ggplot$protegens <- as.character(nmds_for_ggplot$protegens) # this needs to be discrete (could also be a factor)
# and finally we can make the plots:
ggplot(data = nmds_for_ggplot, aes(NMDS1, NMDS2)) +
geom_point(aes(color = Heat, shape = protegens), alpha=0.4) + # adding different colours and shapes for points at different distances
geom_path(data=ellipse12_df, aes(x=NMDS1, y=NMDS2, colour=Heat, linetype=protegens), linewidth=1) + # adding covariance ellipses according to distance # use size argument if ggplot2 < v. 3.4.0
guides(color = guide_legend(override.aes = list(linetype=rep(NA,5)))) + # removes lines from colour part of the legend
scale_colour_viridis_d(option = "plasma", begin=0.05, end = 0.9) +
fave_theme + # not sure why, but this is needed to override the default grey theme
labs(title="NMDS of all data (4sp & 3 time-points)")
# axes 1 & 2 again showing just the ellipses (bc it's hard to see protegens effects as it's so overlapped)
ggplot(data = nmds_for_ggplot, aes(NMDS1, NMDS2)) +
geom_path(data=ellipse12_df, aes(x=NMDS1, y=NMDS2, colour=Heat, linetype=protegens), linewidth=1) + # plot just the ellipses
scale_colour_viridis_d(option = "plasma", begin=0.05, end = 0.9) +
fave_theme + # not sure why, but this is needed to override the default grey theme
labs(title="NMDS of all data (4sp & 3 time-points)")
ggplot(data = nmds_for_ggplot, aes(NMDS1, NMDS3)) +
geom_point(aes(color = Heat, shape = protegens), alpha=0.4) +
geom_path(data=ellipse13_df, aes(x=NMDS1, y=NMDS3, colour=Heat, linetype=protegens), linewidth=1) +
guides(color = guide_legend(override.aes = list(linetype=rep(NA,5)))) +
scale_colour_viridis_d(option = "plasma", begin=0.05, end = 0.9) +
fave_theme +
labs(title="NMDS of all data (4sp & 3 time-points)")
# axes 1 & 3 again showing just the ellipses
ggplot(data = nmds_for_ggplot, aes(NMDS1, NMDS3)) +
geom_path(data=ellipse13_df, aes(x=NMDS1, y=NMDS3, colour=Heat, linetype=protegens), linewidth=1) +
scale_colour_viridis_d(option = "plasma", begin=0.05, end = 0.9) +
fave_theme +
labs(title="NMDS of all data (4sp & 3 time-points)")
ggplot(data = nmds_for_ggplot, aes(NMDS2, NMDS3)) +
geom_point(aes(color = Heat, shape = protegens), alpha=0.4) +
geom_path(data=ellipse23_df, aes(x=NMDS2, y=NMDS3, colour=Heat, linetype=protegens), linewidth=1) +
guides(color = guide_legend(override.aes = list(linetype=rep(NA,5)))) +
scale_colour_viridis_d(option = "plasma", begin=0.05, end = 0.9) +
fave_theme +
labs(title="NMDS of all data (4sp & 3 time-points)")
# axes 2 & 3 again showing just the ellipses (bc it's hard to see protegens effects as it's so overlapped)
ggplot(data = nmds_for_ggplot, aes(NMDS2, NMDS3)) +
geom_path(data=ellipse23_df, aes(x=NMDS2, y=NMDS3, colour=Heat, linetype=protegens), linewidth=1) +
scale_colour_viridis_d(option = "plasma", begin=0.05, end = 0.9) +
fave_theme +
labs(title="NMDS of all data (4sp & 3 time-points)")
################
# check significance:
# using a PERMANOVA to test for differences in community composition
# PERMANOVA (PERmutational Multivariate ANalysis Of VAriance) compares groups like
# an ANOVA, but it is multivariate and permutation-based: it partitions a
# dissimilarity matrix rather than assuming normally distributed variables
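As a univariate toy illustration (my own sketch, not part of the analysis) of how the permutation p-value reported by `adonis2()` is constructed: shuffle the group labels, recompute the test statistic, and ask how often the shuffled statistic meets or exceeds the observed one.

```r
# Toy permutation test in base R: two groups, difference-in-means statistic.
set.seed(42)
y <- c(rnorm(10, mean = 0), rnorm(10, mean = 1))
g <- rep(c("a", "b"), each = 10)
obs <- abs(diff(tapply(y, g, mean)))                      # observed statistic
perm <- replicate(999, abs(diff(tapply(y, sample(g), mean))))  # null distribution
p_val <- (sum(perm >= obs) + 1) / (999 + 1)               # +1: observed counts as one permutation
```

This is why `permutations = 999` below yields p-values with a floor of 0.001: the smallest achievable Pr(>F) is 1/(999+1).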
nmdsdata_test_Heat <- adonis2(abundance_matrix ~ Heat, absDen_wide_forOrd,
permutations = 999, method = "bray")
print(nmdsdata_test_Heat)
## Permutation test for adonis under reduced model
## Permutation: free
## Number of permutations: 999
##
## adonis2(formula = abundance_matrix ~ Heat, data = absDen_wide_forOrd, permutations = 999, method = "bray")
## Df SumOfSqs R2 F Pr(>F)
## Model 4 8.188 0.07534 6.4981 0.001 ***
## Residual 319 100.485 0.92466
## Total 323 108.673 1.00000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
nmdsdata_test_Prot <- adonis2(abundance_matrix ~ protegens, absDen_wide_forOrd,
permutations = 999, method = "bray")
print(nmdsdata_test_Prot)
## Permutation test for adonis under reduced model
## Permutation: free
## Number of permutations: 999
##
## adonis2(formula = abundance_matrix ~ protegens, data = absDen_wide_forOrd, permutations = 999, method = "bray")
## Df SumOfSqs R2 F Pr(>F)
## Model 1 45.211 0.41603 229.39 0.001 ***
## Residual 322 63.462 0.58397
## Total 323 108.673 1.00000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
nmdsdata_test_HeatxProt <- adonis2(abundance_matrix ~ Heat * protegens, absDen_wide_forOrd,
permutations = 999, method = "bray")
print(nmdsdata_test_HeatxProt)
## Permutation test for adonis under reduced model
## Permutation: free
## Number of permutations: 999
##
## adonis2(formula = abundance_matrix ~ Heat * protegens, data = absDen_wide_forOrd, permutations = 999, method = "bray")
## Df SumOfSqs R2 F Pr(>F)
## Model 9 60.966 0.56101 44.586 0.001 ***
## Residual 314 47.706 0.43899
## Total 323 108.673 1.00000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# so these are all significant, but is that spurious because the dispersion differs between groups? (e.g., much smaller for protegens)
##############
# check PERMANOVA assumption of homogeneous group variances
# Bray-curtis distance matrix
dist_mat <- vegdist(abundance_matrix, method = "bray")
# use betadisper test to check for multivariate homogeneity of group variances
dispersion <- betadisper(dist_mat, group = paste(absDen_wide_forOrd$Heat, absDen_wide_forOrd$protegens))
## Warning in betadisper(dist_mat, group = paste(absDen_wide_forOrd$Heat,
## absDen_wide_forOrd$protegens)): some squared distances are negative and changed
## to zero
permutest(dispersion)
##
## Permutation test for homogeneity of multivariate dispersions
## Permutation: free
## Number of permutations: 999
##
## Response: Distances
## Df Sum Sq Mean Sq F N.Perm Pr(>F)
## Groups 9 8.7974 0.97749 23.077 999 0.001 ***
## Residuals 314 13.3006 0.04236
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# yep! Dispersions are significantly heterogeneous, so we need to try a different test that is robust to heterogeneous group variances...
################
# check significance:
# let's test for significance again using ANOSIM (another non-parametric test, but one that uses only the ranks of the dissimilarities)
nmdsdata_test2_HeatxProt <- anosim(dist_mat,
grouping = paste(absDen_wide_forOrd$Heat, absDen_wide_forOrd$protegens),
permutations = 999)
plot(nmdsdata_test2_HeatxProt)
## Warning in (function (z, notch = FALSE, width = NULL, varwidth = FALSE, : some
## notches went outside hinges ('box'): maybe set notch=FALSE
summary(nmdsdata_test2_HeatxProt)
##
## Call:
## anosim(x = dist_mat, grouping = paste(absDen_wide_forOrd$Heat, absDen_wide_forOrd$protegens), permutations = 999)
## Dissimilarity: bray
##
## ANOSIM statistic R: 0.6212
## Significance: 0.001
##
## Permutation: free
## Number of permutations: 999
##
## Upper quantiles of permutations (null model):
## 90% 95% 97.5% 99%
## 0.0102 0.0144 0.0160 0.0211
##
## Dissimilarity ranks between and within classes:
## 0% 25% 50% 75% 100% N
## Between 15.5 15656.00 28220.0 44062.5 44062.5 47065
## 12 0 156.0 4841.00 22156.0 25514.0 44062.5 253
## 12 1 37.0 2073.25 6635.0 10519.0 18965.0 780
## 24 0 15.5 6506.00 22860.0 44062.5 44062.5 465
## 24 1 102.0 3052.00 5746.0 9735.5 17602.0 595
## 48 0 15.5 15176.00 26611.5 44062.5 44062.5 406
## 48 1 33.0 1305.00 3454.0 8128.5 20766.0 595
## 6 0 88.0 6343.00 21367.0 44062.5 44062.5 561
## 6 1 35.0 1550.50 3749.0 6333.5 19718.0 780
## control 0 685.0 8820.00 20181.0 20994.5 44062.5 231
## control 1 32.0 729.00 1856.0 4199.5 17322.0 595
# okay, let's say that we are satisfied with this significance testing...
test <- absDen_wide_forOrd
test$Heat <- as.character(levels(test$Heat))[test$Heat]
test$Heat[test$Heat == "control"] <- 0
test$Heat <- as.numeric(test$Heat)
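The `as.character(levels(...))[...]` indexing above is the standard way to recover a factor's labels rather than its underlying integer codes; a minimal illustration (toy values, my own example):

```r
# Why not as.numeric(factor) directly: it returns the level codes, not the labels.
f <- factor(c("12", "control", "6"), levels = c("6", "12", "control"))
as.numeric(f)                     # level codes: 2 3 1
v <- as.character(levels(f))[f]   # labels: "12" "control" "6"
v[v == "control"] <- "0"
as.numeric(v)                     # 12 0 6
```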
# let's see what the heat gradient looks like
gg_ordiplot(try.NMDS, groups = absDen_wide_forOrd$protegens, plot = TRUE)
gg_envfit(try.NMDS, env = test$Heat, groups = absDen_wide_forOrd$protegens, plot = TRUE, alpha=0.5) # notice this gradient is not significant!!!
gg_envfit(try.NMDS, env = test$Heat, groups = absDen_wide_forOrd$protegens, plot = TRUE, alpha=0.5, choices=c(1,3))
gg_envfit(try.NMDS, env = test$Heat, groups = absDen_wide_forOrd$protegens, plot = TRUE, alpha=0.5, choices=c(2,3))
# clean up
rm(abundance_matrix, try.PCA, dist, try.PCoA, scree_out, try.NMDS, nmds_for_ggplot, ellipse12_df, ellipse13_df, ellipse23_df, nmdsdata_test_Heat, nmdsdata_test_Prot, nmdsdata_test_HeatxProt, dist_mat, dispersion, nmdsdata_test2_HeatxProt, test)
GREAT! This summarizes the same result that I found with the other indices: presence of P. protegens is the most important thing. Communities where this species was present look quite similar across different heat treatments. Longer heat durations push the communities in a different direction, until a threshold is reached at the longest heat treatment (48h).
The NMDS ordination results are significant by PERMANOVA, but the assumptions of that test may be violated because the dispersion is heterogeneous between groups. ANOSIM should be somewhat more robust to this problem because it uses only ranks, and the ordination results are significant by ANOSIM as well.
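To make the rank-based logic concrete, the ANOSIM statistic R = (mean between-group rank − mean within-group rank) / (M/2), with M = n(n−1)/2 dissimilarity pairs, can be computed by hand on toy data (base R only, my own sketch). R near 0 means no group structure; R = 1 means every between-group dissimilarity outranks every within-group one:

```r
# Hand-rolled ANOSIM statistic on toy 1-D data with two well-separated groups.
set.seed(1)
x <- c(rnorm(5, mean = 0), rnorm(5, mean = 5))
grp <- rep(c("a", "b"), each = 5)
d_rank <- rank(dist(x))                            # rank all pairwise dissimilarities
btw <- outer(grp, grp, "!=")[lower.tri(diag(10))]  # TRUE for between-group pairs
M <- length(d_rank)                                # n*(n-1)/2 = 45 pairs
R <- (mean(d_rank[btw]) - mean(d_rank[!btw])) / (M / 2)
R  # 1: the two groups are perfectly separated
```

Because only ranks enter the statistic, the absolute magnitudes of the dissimilarities (and hence differences in group spread) matter less than in PERMANOVA.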
The biggest variation in our data comes from the presence/absence of protegens. Let's see how the ordination changes when we remove protegens. A priori I would hope for better resolution of the heat effect; for example, maybe one NMDS axis will split up the different heat treatments. In addition, we can check that the significant result above was not driven solely by the presence/absence of protegens.
# remove protegens from the data
absDen_wide_forOrd_NOprotegens <- absDen_wide_forOrd %>% filter(protegens == 0) %>%
select(-Pt_Resist, -Pt_earlyR, -Pt_lateR)
abundance_mat_NOprotegens <- as.matrix(absDen_wide_forOrd_NOprotegens[,9:17])
# re-do the NMDS
scree_out <- NMDS.scree(abundance_mat_NOprotegens)
## Square root transformation
## Wisconsin double standardization
## ... (NMDS.scree console output condensed: 20 random starts at each of 10
## successive dimensionalities; best stress 0.284, 0.125, 0.054, 0.037, 0.026,
## 0.019, 0.016, 0.013, 0.011, 0.010 -- the scree flattens after k = 3)
plot(scree_out)
# again it's k=3 that looks appropriate
try.NMDS <- metaMDS(abundance_mat_NOprotegens, distance = "bray", k = 3, autotransform = TRUE, trymax=100)
## Square root transformation
## Wisconsin double standardization
## ... (20 random starts condensed: best solution stress 0.05410834,
## "*** Best solution repeated 5 times")
# plot the results for axis 1 & 2
ordiplot(try.NMDS, type = "n") # create blank ordination plot
orditorp(try.NMDS, display = "sites", cex = 0.5, air = 0.1) # add row numbers in black
orditorp(try.NMDS, display = "species", col="red", air = 0.1) # add species names in red
# plot the results for axis 1 & 3
ordiplot(try.NMDS, choices = c(1,3), type = "n") # create blank ordination plot
orditorp(try.NMDS, choices = c(1,3), display = "sites", cex = 0.5, air = 0.1) # add row numbers in black
orditorp(try.NMDS, choices = c(1,3), display = "species", col="red", air = 0.1) # add species names in red
# plot the results for axis 2 & 3
ordiplot(try.NMDS, choices = c(2,3), type = "n") # create blank ordination plot
orditorp(try.NMDS, choices = c(2,3), display = "sites", cex = 0.5, air = 0.1) # add row numbers in black
orditorp(try.NMDS, choices = c(2,3), display = "species", col="red", air = 0.1) # add species names in red
# AH-HA!! axes 2 & 3 are separating based on resistance vs recovery!
## I would like to try rotating the ordination axes to see if the resistance vs recovery separation can be emphasized further...
# but any rotation should preserve the separation and trend along heat duration...
# visualize using ellipse plot for axis 1 & 2
ordiplot(try.NMDS)
ordiellipse(try.NMDS, absDen_wide_forOrd_NOprotegens$Heat, label = FALSE,
border =c("bisque1", "orange", "brown1", "darkorchid1", "deepskyblue2"), lwd=3,
draw = "polygon", alpha=120) # add ellipses grouped by Heat treatment
legend("topright", title="Heat",
levels(absDen_wide_forOrd_NOprotegens$Heat),
fill=c("bisque1", "orange", "brown1", "darkorchid1", "deepskyblue2"), horiz=FALSE, cex=.9)
# ellipse plot for axis 1 & 3
ordiplot(try.NMDS, choices = c(1,3)) # plot shows communities (circles) and species:time (crosses)
ordiellipse(try.NMDS, choices = c(1,3), absDen_wide_forOrd_NOprotegens$Heat, label = FALSE,
border =c("bisque1", "orange", "brown1", "darkorchid1", "deepskyblue2"), lwd=3,
draw = "polygon", alpha=120)
legend("topright", title="Heat",
levels(absDen_wide_forOrd_NOprotegens$Heat),
fill=c("bisque1", "orange", "brown1", "darkorchid1", "deepskyblue2"), horiz=FALSE, cex=.9)
# ellipse plot for axis 2 & 3
ordiplot(try.NMDS, choices = c(2,3)) # plot shows communities (circles) and species:time (crosses)
ordiellipse(try.NMDS, choices = c(2,3), absDen_wide_forOrd_NOprotegens$Heat, label = FALSE,
border =c("bisque1", "orange", "brown1", "darkorchid1", "deepskyblue2"), lwd=3, lty=1,
draw = "polygon", alpha=120)
legend("topright", title="Heat",
levels(absDen_wide_forOrd_NOprotegens$Heat),
fill=c("bisque1", "orange", "brown1", "darkorchid1", "deepskyblue2"), horiz=FALSE, cex=.9)
# let's check if this gradient is significant...
test <- absDen_wide_forOrd_NOprotegens
test$Heat <- as.character(levels(test$Heat))[test$Heat]
test$Heat[test$Heat == "control"] <- 0
test$Heat <- as.numeric(test$Heat)
gg_envfit(try.NMDS, env = test$Heat, groups = absDen_wide_forOrd_NOprotegens$Heat, plot = TRUE, alpha=0.6)# notice this gradient is not significant!!!
gg_envfit(try.NMDS, env = test$Heat, groups = absDen_wide_forOrd_NOprotegens$Heat, plot = TRUE, alpha=0.6, choices=c(1,3))
gg_envfit(try.NMDS, env = test$Heat, groups = absDen_wide_forOrd_NOprotegens$Heat, plot = TRUE, alpha=0.6, choices=c(2,3))
## THIS GRADIENT IS NOT SIGNIFICANT!!!!
################
# check significance using a PERMANOVA to test the differences in community composition
nmdsdata_test_Heat <- adonis2(abundance_mat_NOprotegens ~ Heat, absDen_wide_forOrd_NOprotegens,
permutations = 999, method = "bray")
print(nmdsdata_test_Heat)
## Permutation test for adonis under reduced model
## Permutation: free
## Number of permutations: 999
##
## adonis2(formula = abundance_mat_NOprotegens ~ Heat, data = absDen_wide_forOrd_NOprotegens, permutations = 999, method = "bray")
## Df SumOfSqs R2 F Pr(>F)
## Model 4 7.976 0.165 6.6195 0.001 ***
## Residual 134 40.362 0.835
## Total 138 48.338 1.000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# check PERMANOVA assumption of homogeneous group variances
# Bray-curtis distance matrix
dist_mat <- vegdist(abundance_mat_NOprotegens, method = "bray")
# use betadisper test to check for multivariate homogeneity of group variances
dispersion <- betadisper(dist_mat, group = absDen_wide_forOrd_NOprotegens$Heat)
permutest(dispersion)
##
## Permutation test for homogeneity of multivariate dispersions
## Permutation: free
## Number of permutations: 999
##
## Response: Distances
## Df Sum Sq Mean Sq F N.Perm Pr(>F)
## Groups 4 0.241 0.060258 0.7306 999 0.573
## Residuals 134 11.052 0.082481
# Good: the group dispersions are homogeneous (p = 0.573), so the PERMANOVA result is trustworthy
# As an additional check, test the significance of the grouping using ANOSIM
nmdsdata_test2_Heat <- anosim(dist_mat,
grouping = absDen_wide_forOrd_NOprotegens$Heat,
permutations = 999)
plot(nmdsdata_test2_Heat)
summary(nmdsdata_test2_Heat)
##
## Call:
## anosim(x = dist_mat, grouping = absDen_wide_forOrd_NOprotegens$Heat, permutations = 999)
## Dissimilarity: bray
##
## ANOSIM statistic R: 0.1723
## Significance: 0.001
##
## Permutation: free
## Number of permutations: 999
##
## Upper quantiles of permutations (null model):
## 90% 95% 97.5% 99%
## 0.0189 0.0261 0.0333 0.0450
##
## Dissimilarity ranks between and within classes:
## 0% 25% 50% 75% 100% N
## Between 15.5 2663.5 5022.0 7177.0 8411.5 7675
## control 90.0 1130.5 3450.0 3929.5 8411.5 231
## 6 37.0 725.0 4256.0 8411.5 8411.5 561
## 12 39.0 523.0 4671.0 5251.0 8411.5 253
## 24 15.5 759.0 4886.0 8411.5 8411.5 465
## 48 15.5 2108.0 5400.5 8411.5 8411.5 406
# clean up
rm(absDen_wide_forOrd_NOprotegens, abundance_mat_NOprotegens, scree_out, try.NMDS, test, nmdsdata_test_Heat, dist_mat, dispersion, nmdsdata_test2_Heat)
The plots above show the communities without P. protegens. On the bright side, I am finally seeing the separation of the resistance vs. recovery time points that I was hoping for (visible on NMDS axes 2 & 3)… On the other hand, the 48h heat treatment still drifts back to overlap with the control, which is most likely what makes the heat gradient non-significant in the gg_envfit plots.
Possible TO DO: try rotating the NMDS so that one of the axes captures the separation between resistance & recovery; see here, https://uw.pressbooks.pub/appliedmultivariatestatistics/chapter/visualizing-and-interpreting-ordinations/ The complication is that we want to rotate the NMDS using the column values (not an environmental variable)…
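One possibility (a sketch, not yet run; it would need to go before the clean-up chunk, while try.NMDS and abundance_mat_NOprotegens are still in scope): vegan::MDSrotate() rotates an NMDS solution so that its first axis is parallel to a supplied numeric vector, and nothing stops us from passing a species column of the abundance matrix as that vector instead of an environmental variable.

```r
library(vegan)
# Rotate the ordination so axis 1 is parallel to one species' abundances.
# Column 1 is a placeholder -- substitute the species thought to separate
# resistance from recovery time points.
rot.NMDS <- MDSrotate(try.NMDS, abundance_mat_NOprotegens[, 1])
ordiplot(rot.NMDS)  # re-check the resistance vs recovery separation on axis 1
```

Note that MDSrotate only re-orients the configuration; stress and inter-point distances are unchanged, so the heat-duration trend is preserved.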
Let’s see what the NMDS looks like when we consider just the communities with P. protegens.
# keep only the communities WITH P. protegens
absDen_wide_forOrd_YESprotegens <- absDen_wide_forOrd %>% filter(protegens == 1)
abundance_mat_YESprotegens <- as.matrix(absDen_wide_forOrd_YESprotegens[,9:20])
# re-do the NMDS
scree_out <- NMDS.scree(abundance_mat_YESprotegens)
## Square root transformation
## Wisconsin double standardization
## Run 0 stress 0.370876
## Run 1 stress 0.4681745
## Run 2 stress 0.574197
## Run 3 stress 0.5742099
## Run 4 stress 0.4705634
## Run 5 stress 0.4734676
## Run 6 stress 0.4705311
## Run 7 stress 0.5741885
## Run 8 stress 0.4625914
## Run 9 stress 0.4584503
## Run 10 stress 0.473955
## Run 11 stress 0.5741558
## Run 12 stress 0.473897
## Run 13 stress 0.469525
## Run 14 stress 0.4655971
## Run 15 stress 0.4654771
## Run 16 stress 0.4713435
## Run 17 stress 0.4701758
## Run 18 stress 0.4588954
## Run 19 stress 0.470047
## Run 20 stress 0.469121
## *** Best solution was not repeated -- monoMDS stopping criteria:
## 4: stress ratio > sratmax
## 16: scale factor of the gradient < sfgrmin
## Square root transformation
## Wisconsin double standardization
## Run 0 stress 0.2160476
## Run 1 stress 0.235491
## Run 2 stress 0.2332108
## Run 3 stress 0.2170011
## Run 4 stress 0.2339155
## Run 5 stress 0.2154501
## ... New best solution
## ... Procrustes: rmse 0.0322889 max resid 0.1733119
## Run 6 stress 0.2151676
## ... New best solution
## ... Procrustes: rmse 0.02200901 max resid 0.1809585
## Run 7 stress 0.2366953
## Run 8 stress 0.2175666
## Run 9 stress 0.2504994
## Run 10 stress 0.2174253
## Run 11 stress 0.2179639
## Run 12 stress 0.2242299
## Run 13 stress 0.2221021
## Run 14 stress 0.2165935
## Run 15 stress 0.2211208
## Run 16 stress 0.2192574
## Run 17 stress 0.2270675
## Run 18 stress 0.2265483
## Run 19 stress 0.2155847
## ... Procrustes: rmse 0.0225955 max resid 0.1315485
## Run 20 stress 0.2114126
## ... New best solution
## ... Procrustes: rmse 0.03819155 max resid 0.1745703
## *** Best solution was not repeated -- monoMDS stopping criteria:
## 1: no. of iterations >= maxit
## 18: stress ratio > sratmax
## 1: scale factor of the gradient < sfgrmin
## Square root transformation
## Wisconsin double standardization
## Run 0 stress 0.1482433
## Run 1 stress 0.1456813
## ... New best solution
## ... Procrustes: rmse 0.03965755 max resid 0.1939238
## Run 2 stress 0.1497792
## Run 3 stress 0.1514268
## Run 4 stress 0.1496895
## Run 5 stress 0.1471737
## Run 6 stress 0.1486873
## Run 7 stress 0.1482253
## Run 8 stress 0.1474813
## Run 9 stress 0.1474919
## Run 10 stress 0.1523312
## Run 11 stress 0.1484543
## Run 12 stress 0.1473178
## Run 13 stress 0.1511583
## Run 14 stress 0.1456387
## ... New best solution
## ... Procrustes: rmse 0.004274682 max resid 0.04351588
## Run 15 stress 0.1480893
## Run 16 stress 0.149069
## Run 17 stress 0.1476921
## Run 18 stress 0.1500745
## Run 19 stress 0.1523591
## Run 20 stress 0.1479497
## *** Best solution was not repeated -- monoMDS stopping criteria:
## 10: no. of iterations >= maxit
## 10: stress ratio > sratmax
## Square root transformation
## Wisconsin double standardization
## Run 0 stress 0.09973964
## Run 1 stress 0.09973998
## ... Procrustes: rmse 0.0005561904 max resid 0.003502212
## ... Similar to previous best
## Run 2 stress 0.09972797
## ... New best solution
## ... Procrustes: rmse 0.001348257 max resid 0.01173098
## Run 3 stress 0.09997666
## ... Procrustes: rmse 0.007618021 max resid 0.06493934
## Run 4 stress 0.09963688
## ... New best solution
## ... Procrustes: rmse 0.002991554 max resid 0.032299
## Run 5 stress 0.09995202
## ... Procrustes: rmse 0.003225794 max resid 0.02645518
## Run 6 stress 0.0999824
## ... Procrustes: rmse 0.008459488 max resid 0.06359316
## Run 7 stress 0.09961919
## ... New best solution
## ... Procrustes: rmse 0.001228445 max resid 0.01473751
## Run 8 stress 0.09974154
## ... Procrustes: rmse 0.005019653 max resid 0.04479507
## Run 9 stress 0.1011065
## Run 10 stress 0.09963672
## ... Procrustes: rmse 0.001351835 max resid 0.01577513
## Run 11 stress 0.09964423
## ... Procrustes: rmse 0.001660032 max resid 0.01858981
## Run 12 stress 0.1005342
## Run 13 stress 0.1039103
## Run 14 stress 0.1017818
## Run 15 stress 0.09961881
## ... New best solution
## ... Procrustes: rmse 0.0002662908 max resid 0.001046038
## ... Similar to previous best
## Run 16 stress 0.09961922
## ... Procrustes: rmse 0.0002339491 max resid 0.001392829
## ... Similar to previous best
## Run 17 stress 0.0997031
## ... Procrustes: rmse 0.002334775 max resid 0.02314429
## Run 18 stress 0.1009048
## Run 19 stress 0.1041863
## Run 20 stress 0.1004246
## *** Best solution repeated 2 times
## Square root transformation
## Wisconsin double standardization
## Run 0 stress 0.0725733
## Run 1 stress 0.07257264
## ... New best solution
## ... Procrustes: rmse 0.0003194622 max resid 0.003090981
## ... Similar to previous best
## Run 2 stress 0.07309919
## Run 3 stress 0.07257296
## ... Procrustes: rmse 0.0006325145 max resid 0.004418431
## ... Similar to previous best
## Run 4 stress 0.0726014
## ... Procrustes: rmse 0.0014311 max resid 0.008801278
## ... Similar to previous best
## Run 5 stress 0.07257292
## ... Procrustes: rmse 0.0006074701 max resid 0.00416851
## ... Similar to previous best
## Run 6 stress 0.0725716
## ... New best solution
## ... Procrustes: rmse 0.0003445536 max resid 0.002397111
## ... Similar to previous best
## Run 7 stress 0.07257164
## ... Procrustes: rmse 0.0001865404 max resid 0.001757612
## ... Similar to previous best
## Run 8 stress 0.07302048
## ... Procrustes: rmse 0.01166279 max resid 0.07754701
## Run 9 stress 0.07311551
## Run 10 stress 0.0798058
## Run 11 stress 0.07263383
## ... Procrustes: rmse 0.003372631 max resid 0.01994465
## Run 12 stress 0.07257237
## ... Procrustes: rmse 0.0002009512 max resid 0.001771005
## ... Similar to previous best
## Run 13 stress 0.07263565
## ... Procrustes: rmse 0.00257157 max resid 0.01482111
## Run 14 stress 0.07257297
## ... Procrustes: rmse 0.0003660621 max resid 0.002133192
## ... Similar to previous best
## Run 15 stress 0.072587
## ... Procrustes: rmse 0.001412232 max resid 0.009790739
## ... Similar to previous best
## Run 16 stress 0.07257239
## ... Procrustes: rmse 0.0003096933 max resid 0.002062985
## ... Similar to previous best
## Run 17 stress 0.07257251
## ... Procrustes: rmse 0.0002471723 max resid 0.002096822
## ... Similar to previous best
## Run 18 stress 0.0725729
## ... Procrustes: rmse 0.0002980323 max resid 0.001699173
## ... Similar to previous best
## Run 19 stress 0.07257307
## ... Procrustes: rmse 0.0004027589 max resid 0.002250486
## ... Similar to previous best
## Run 20 stress 0.07260418
## ... Procrustes: rmse 0.002207701 max resid 0.01414373
## *** Best solution repeated 9 times
## Square root transformation
## Wisconsin double standardization
## Run 0 stress 0.05814966
## Run 1 stress 0.0576679
## ... New best solution
## ... Procrustes: rmse 0.02483569 max resid 0.1526146
## Run 2 stress 0.05822194
## Run 3 stress 0.05824553
## Run 4 stress 0.05821129
## Run 5 stress 0.057846
## ... Procrustes: rmse 0.008050105 max resid 0.09541474
## Run 6 stress 0.05767447
## ... Procrustes: rmse 0.0008529801 max resid 0.005290324
## ... Similar to previous best
## Run 7 stress 0.05774903
## ... Procrustes: rmse 0.007146384 max resid 0.0887752
## Run 8 stress 0.05766905
## ... Procrustes: rmse 0.0006205188 max resid 0.005488768
## ... Similar to previous best
## Run 9 stress 0.05816657
## ... Procrustes: rmse 0.02395012 max resid 0.1495198
## Run 10 stress 0.05780962
## ... Procrustes: rmse 0.008027754 max resid 0.09780724
## Run 11 stress 0.05930532
## Run 12 stress 0.05822384
## Run 13 stress 0.05848534
## Run 14 stress 0.05844978
## Run 15 stress 0.05767054
## ... Procrustes: rmse 0.0004412508 max resid 0.003981539
## ... Similar to previous best
## Run 16 stress 0.05768654
## ... Procrustes: rmse 0.001586304 max resid 0.009639436
## ... Similar to previous best
## Run 17 stress 0.05849238
## Run 18 stress 0.05766817
## ... Procrustes: rmse 0.0001196741 max resid 0.0008132956
## ... Similar to previous best
## Run 19 stress 0.05820264
## Run 20 stress 0.0581483
## ... Procrustes: rmse 0.0246125 max resid 0.1516583
## *** Best solution repeated 5 times
## Square root transformation
## Wisconsin double standardization
## Run 0 stress 0.04791283
## Run 1 stress 0.04802536
## ... Procrustes: rmse 0.003708177 max resid 0.03599999
## Run 2 stress 0.05032456
## Run 3 stress 0.04796402
## ... Procrustes: rmse 0.00301236 max resid 0.01682434
## Run 4 stress 0.05047483
## Run 5 stress 0.0482666
## ... Procrustes: rmse 0.00959043 max resid 0.112046
## Run 6 stress 0.04802682
## ... Procrustes: rmse 0.004291973 max resid 0.01788572
## Run 7 stress 0.04938845
## Run 8 stress 0.04797709
## ... Procrustes: rmse 0.003196213 max resid 0.01533002
## Run 9 stress 0.04792674
## ... Procrustes: rmse 0.001133911 max resid 0.007916974
## ... Similar to previous best
## Run 10 stress 0.04808362
## ... Procrustes: rmse 0.006519017 max resid 0.0227698
## Run 11 stress 0.04806061
## ... Procrustes: rmse 0.003716663 max resid 0.01672056
## Run 12 stress 0.05062234
## Run 13 stress 0.04813121
## ... Procrustes: rmse 0.007353495 max resid 0.02348682
## Run 14 stress 0.04795571
## ... Procrustes: rmse 0.002701568 max resid 0.01573783
## Run 15 stress 0.05072726
## Run 16 stress 0.04815246
## ... Procrustes: rmse 0.005894454 max resid 0.07032434
## Run 17 stress 0.04797026
## ... Procrustes: rmse 0.003427751 max resid 0.01698108
## Run 18 stress 0.0482177
## ... Procrustes: rmse 0.008923761 max resid 0.108388
## Run 19 stress 0.04792936
## ... Procrustes: rmse 0.00174285 max resid 0.02066037
## Run 20 stress 0.04827439
## ... Procrustes: rmse 0.00901088 max resid 0.1052376
## *** Best solution repeated 1 times
## Square root transformation
## Wisconsin double standardization
## Run 0 stress 0.04125626
## Run 1 stress 0.04201108
## Run 2 stress 0.04207672
## Run 3 stress 0.04236719
## Run 4 stress 0.0414552
## ... Procrustes: rmse 0.02002105 max resid 0.07642855
## Run 5 stress 0.04151284
## ... Procrustes: rmse 0.00956349 max resid 0.09020352
## Run 6 stress 0.04124682
## ... New best solution
## ... Procrustes: rmse 0.002996605 max resid 0.02223987
## Run 7 stress 0.04128467
## ... Procrustes: rmse 0.009777887 max resid 0.04977943
## Run 8 stress 0.04260379
## Run 9 stress 0.04187112
## Run 10 stress 0.04124635
## ... New best solution
## ... Procrustes: rmse 0.002675721 max resid 0.01988723
## Run 11 stress 0.04158365
## ... Procrustes: rmse 0.02056787 max resid 0.0790522
## Run 12 stress 0.04124158
## ... New best solution
## ... Procrustes: rmse 0.002120486 max resid 0.01335083
## Run 13 stress 0.04230653
## Run 14 stress 0.04239336
## Run 15 stress 0.04223782
## Run 16 stress 0.04146907
## ... Procrustes: rmse 0.02113384 max resid 0.1136306
## Run 17 stress 0.04147001
## ... Procrustes: rmse 0.005353762 max resid 0.03115803
## Run 18 stress 0.04131044
## ... Procrustes: rmse 0.002722603 max resid 0.01761573
## Run 19 stress 0.04159933
## ... Procrustes: rmse 0.01604176 max resid 0.06917874
## Run 20 stress 0.04145856
## ... Procrustes: rmse 0.01753338 max resid 0.1223633
## *** Best solution was not repeated -- monoMDS stopping criteria:
## 20: no. of iterations >= maxit
## Square root transformation
## Wisconsin double standardization
## Run 0 stress 0.03488104
## Run 1 stress 0.03488968
## ... Procrustes: rmse 0.0007756289 max resid 0.007176625
## ... Similar to previous best
## Run 2 stress 0.03524159
## ... Procrustes: rmse 0.008304594 max resid 0.09270213
## Run 3 stress 0.0372763
## Run 4 stress 0.03617726
## Run 5 stress 0.03500679
## ... Procrustes: rmse 0.004125578 max resid 0.03805999
## Run 6 stress 0.0352001
## ... Procrustes: rmse 0.008884862 max resid 0.1088861
## Run 7 stress 0.03601339
## Run 8 stress 0.03531781
## ... Procrustes: rmse 0.01003636 max resid 0.1156378
## Run 9 stress 0.03539252
## Run 10 stress 0.03584352
## Run 11 stress 0.03496084
## ... Procrustes: rmse 0.005221199 max resid 0.061525
## Run 12 stress 0.03554171
## Run 13 stress 0.03529509
## ... Procrustes: rmse 0.006817072 max resid 0.06099992
## Run 14 stress 0.03588236
## Run 15 stress 0.0358881
## Run 16 stress 0.0350866
## ... Procrustes: rmse 0.006930338 max resid 0.08056406
## Run 17 stress 0.0351135
## ... Procrustes: rmse 0.007828037 max resid 0.09706856
## Run 18 stress 0.03503163
## ... Procrustes: rmse 0.003768092 max resid 0.03208957
## Run 19 stress 0.0351785
## ... Procrustes: rmse 0.006660188 max resid 0.06430347
## Run 20 stress 0.03698452
## *** Best solution repeated 1 times
## Square root transformation
## Wisconsin double standardization
## Run 0 stress 0.0302123
## Run 1 stress 0.03161391
## Run 2 stress 0.03095315
## Run 3 stress 0.03090809
## Run 4 stress 0.03110426
## Run 5 stress 0.03243391
## Run 6 stress 0.03194177
## Run 7 stress 0.03035967
## ... Procrustes: rmse 0.007249735 max resid 0.05672009
## Run 8 stress 0.03104679
## Run 9 stress 0.03146606
## Run 10 stress 0.03067574
## ... Procrustes: rmse 0.009834559 max resid 0.07180621
## Run 11 stress 0.03113342
## Run 12 stress 0.03052824
## ... Procrustes: rmse 0.003823307 max resid 0.01602245
## Run 13 stress 0.03164037
## Run 14 stress 0.03276051
## Run 15 stress 0.03118566
## Run 16 stress 0.03032166
## ... Procrustes: rmse 0.008204051 max resid 0.07366655
## Run 17 stress 0.03174641
## Run 18 stress 0.03105872
## Run 19 stress 0.03074085
## Run 20 stress 0.03155444
## *** Best solution was not repeated -- monoMDS stopping criteria:
## 20: no. of iterations >= maxit
plot(scree_out)
# the scree looks worse than the one above. Keep k=3 anyway: otherwise we would need something like k=6, and higher dimensions are very hard to interpret...
try.NMDS <- metaMDS(abundance_mat_YESprotegens, distance = "bray", k = 3, autotransform = TRUE, trymax=100)
## Square root transformation
## Wisconsin double standardization
## Run 0 stress 0.1482433
## Run 1 stress 0.1489441
## Run 2 stress 0.148394
## ... Procrustes: rmse 0.03681961 max resid 0.1928881
## Run 3 stress 0.1489384
## Run 4 stress 0.1486591
## ... Procrustes: rmse 0.04134371 max resid 0.1689576
## Run 5 stress 0.1478662
## ... New best solution
## ... Procrustes: rmse 0.0363405 max resid 0.1597604
## Run 6 stress 0.1455627
## ... New best solution
## ... Procrustes: rmse 0.04808803 max resid 0.1656343
## Run 7 stress 0.1509178
## Run 8 stress 0.1482218
## Run 9 stress 0.1484081
## Run 10 stress 0.1463412
## Run 11 stress 0.1493465
## Run 12 stress 0.1480063
## Run 13 stress 0.1455942
## ... Procrustes: rmse 0.003776159 max resid 0.03961331
## Run 14 stress 0.1455909
## ... Procrustes: rmse 0.004008198 max resid 0.04486445
## Run 15 stress 0.1506981
## Run 16 stress 0.1473028
## Run 17 stress 0.1517896
## Run 18 stress 0.15032
## Run 19 stress 0.1476959
## Run 20 stress 0.1499465
## Run 21 stress 0.1500067
## Run 22 stress 0.1478383
## Run 23 stress 0.14588
## ... Procrustes: rmse 0.006519738 max resid 0.05649222
## Run 24 stress 0.1455899
## ... Procrustes: rmse 0.003885311 max resid 0.04361052
## Run 25 stress 0.1466042
## Run 26 stress 0.1494831
## Run 27 stress 0.1505864
## Run 28 stress 0.1499471
## Run 29 stress 0.1500075
## Run 30 stress 0.1521868
## Run 31 stress 0.1469362
## Run 32 stress 0.1491867
## Run 33 stress 0.1507483
## Run 34 stress 0.149585
## Run 35 stress 0.1485476
## Run 36 stress 0.1488649
## Run 37 stress 0.1494758
## Run 38 stress 0.1483997
## Run 39 stress 0.1480531
## Run 40 stress 0.1490321
## Run 41 stress 0.1466811
## Run 42 stress 0.1469902
## Run 43 stress 0.1497476
## Run 44 stress 0.1482443
## Run 45 stress 0.1470318
## Run 46 stress 0.1477852
## Run 47 stress 0.1495831
## Run 48 stress 0.1488145
## Run 49 stress 0.1500086
## Run 50 stress 0.1484542
## Run 51 stress 0.1517324
## Run 52 stress 0.1515056
## Run 53 stress 0.1476527
## Run 54 stress 0.1475296
## Run 55 stress 0.1456407
## ... Procrustes: rmse 0.01268934 max resid 0.07960572
## Run 56 stress 0.1483926
## Run 57 stress 0.1479171
## Run 58 stress 0.1499533
## Run 59 stress 0.1487001
## Run 60 stress 0.1476917
## Run 61 stress 0.150504
## Run 62 stress 0.1525764
## Run 63 stress 0.1464675
## Run 64 stress 0.1476523
## Run 65 stress 0.1483516
## Run 66 stress 0.1484862
## Run 67 stress 0.1501227
## Run 68 stress 0.1522436
## Run 69 stress 0.1481218
## Run 70 stress 0.1497821
## Run 71 stress 0.1479162
## Run 72 stress 0.1489412
## Run 73 stress 0.1486362
## Run 74 stress 0.1471402
## Run 75 stress 0.1478571
## Run 76 stress 0.1455908
## ... Procrustes: rmse 0.004022627 max resid 0.04487824
## Run 77 stress 0.1516185
## Run 78 stress 0.1486013
## Run 79 stress 0.1488615
## Run 80 stress 0.1456374
## ... Procrustes: rmse 0.01262305 max resid 0.07525878
## Run 81 stress 0.1473432
## Run 82 stress 0.1523155
## Run 83 stress 0.1485357
## Run 84 stress 0.1456386
## ... Procrustes: rmse 0.0126663 max resid 0.07813525
## Run 85 stress 0.1502793
## Run 86 stress 0.1456834
## ... Procrustes: rmse 0.002644083 max resid 0.031996
## Run 87 stress 0.1502627
## Run 88 stress 0.1476363
## Run 89 stress 0.1498797
## Run 90 stress 0.150748
## Run 91 stress 0.1471715
## Run 92 stress 0.1486282
## Run 93 stress 0.154804
## Run 94 stress 0.1457923
## ... Procrustes: rmse 0.004647124 max resid 0.03333261
## Run 95 stress 0.1472113
## Run 96 stress 0.148393
## Run 97 stress 0.1476113
## Run 98 stress 0.149775
## Run 99 stress 0.1456385
## ... Procrustes: rmse 0.01280332 max resid 0.07935037
## Run 100 stress 0.1493803
## *** Best solution was not repeated -- monoMDS stopping criteria:
## 58: no. of iterations >= maxit
## 42: stress ratio > sratmax
try.NMDS$stress # stress < 0.1 is considered a good fit; anything < 0.2 is still usable.
## [1] 0.1455627
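Beyond the single stress value, a Shepard diagram (vegan::stressplot) shows how well the ordination distances preserve the observed Bray-Curtis dissimilarities, and goodness() flags poorly fitted sites. A sketch, assuming try.NMDS is still in scope:

```r
# Shepard diagram: observed dissimilarity vs ordination distance; a tight
# monotone scatter (high non-metric R^2) indicates a faithful fit.
stressplot(try.NMDS)
# Per-site goodness of fit; large values flag sites the ordination places poorly.
head(sort(goodness(try.NMDS), decreasing = TRUE))
```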
# plot the results for axis 1 & 2
ordiplot(try.NMDS, type = "n") # create blank ordination plot
orditorp(try.NMDS, display = "sites", cex = 0.5, air = 0.1) # add row numbers in black
orditorp(try.NMDS, display = "species", col="red", air = 0.1) # add species names in red
# plot the results for axis 1 & 3
ordiplot(try.NMDS, choices = c(1,3), type = "n") # create blank ordination plot
orditorp(try.NMDS, choices = c(1,3), display = "sites", cex = 0.5, air = 0.1) # add row numbers in black
orditorp(try.NMDS, choices = c(1,3), display = "species", col="red", air = 0.1) # add species names in red
# plot the results for axis 2 & 3
ordiplot(try.NMDS, choices = c(2,3), type = "n") # create blank ordination plot
orditorp(try.NMDS, choices = c(2,3), display = "sites", cex = 0.5, air = 0.1) # add row numbers in black
orditorp(try.NMDS, choices = c(2,3), display = "species", col="red", air = 0.1) # add species names in red
# visualize using ellipse plot for axis 1 & 2
ordiplot(try.NMDS)
ordiellipse(try.NMDS, absDen_wide_forOrd_YESprotegens$Heat, label = FALSE,
border =c("bisque1", "orange", "brown1", "darkorchid1", "deepskyblue2"), lwd=3,
draw = "polygon", alpha=120) # add ellipses grouped by Heat treatment
legend("topright", title="Heat",
levels(absDen_wide_forOrd_YESprotegens$Heat),
fill=c("bisque1", "orange", "brown1", "darkorchid1", "deepskyblue2"), horiz=FALSE, cex=.9)
# ellipse plot for axis 1 & 3
ordiplot(try.NMDS, choices = c(1,3)) # plot shows communities (circles) and species:time (crosses)
ordiellipse(try.NMDS, choices = c(1,3), absDen_wide_forOrd_YESprotegens$Heat, label = FALSE,
border =c("bisque1", "orange", "brown1", "darkorchid1", "deepskyblue2"), lwd=3,
draw = "polygon", alpha=120)
legend("topright", title="Heat",
levels(absDen_wide_forOrd_YESprotegens$Heat),
fill=c("bisque1", "orange", "brown1", "darkorchid1", "deepskyblue2"), horiz=FALSE, cex=.9)
# ellipse plot for axis 2 & 3
ordiplot(try.NMDS, choices = c(2,3)) # plot shows communities (circles) and species:time (crosses)
ordiellipse(try.NMDS, choices = c(2,3), absDen_wide_forOrd_YESprotegens$Heat, label = FALSE,
border =c("bisque1", "orange", "brown1", "darkorchid1", "deepskyblue2"), lwd=3, lty=1,
draw = "polygon", alpha=120)
legend("topright", title="Heat",
levels(absDen_wide_forOrd_YESprotegens$Heat),
fill=c("bisque1", "orange", "brown1", "darkorchid1", "deepskyblue2"), horiz=FALSE, cex=.9)
# let's check if this gradient is significant...
test <- absDen_wide_forOrd_YESprotegens
test$Heat <- as.character(levels(test$Heat))[test$Heat]
test$Heat[test$Heat == "control"] <- 0
test$Heat <- as.numeric(test$Heat)
# for proper labelling of the arrow, need to supply env as a data.frame
temp <- as.data.frame(test$Heat)
colnames(temp) <- "Heat"
## AWESOME! the gradient is finally significant here (in contrast to the communities without P. protegens).
gg_envfit(try.NMDS, env = temp, groups = absDen_wide_forOrd_YESprotegens$Heat, plot = TRUE, alpha=0.01)
gg_envfit(try.NMDS, env = temp, groups = absDen_wide_forOrd_YESprotegens$Heat, plot = TRUE, alpha=0.01, choices=c(1,3))
gg_envfit(try.NMDS, env = temp, groups = absDen_wide_forOrd_YESprotegens$Heat, plot = TRUE, alpha=0.01, choices=c(2,3))
## try plotting it as a surface contour:
ordisurf(try.NMDS ~ temp$Heat)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## y ~ s(x1, x2, k = 10, bs = "tp", fx = FALSE)
##
## Estimated degrees of freedom:
## 8.03 total = 9.03
##
## REML score: 664.4488
ordisurf(try.NMDS ~ temp$Heat, choices = c(1,3))
##
## Family: gaussian
## Link function: identity
##
## Formula:
## y ~ s(x1, x2, k = 10, bs = "tp", fx = FALSE)
##
## Estimated degrees of freedom:
## 7.34 total = 8.34
##
## REML score: 731.1893
ordisurf(try.NMDS ~ temp$Heat, choices = c(2,3))
##
## Family: gaussian
## Link function: identity
##
## Formula:
## y ~ s(x1, x2, k = 10, bs = "tp", fx = FALSE)
##
## Estimated degrees of freedom:
## 6.39 total = 7.39
##
## REML score: 733.743
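The bulged contours above suggest the heat effect may not be linear in hours. As a quick, hedged probe (assuming try.NMDS and temp are still in scope from the chunk above), we can compare the deviance explained by surfaces fitted on raw vs. log-transformed hours:

```r
# Sketch: fit smooth surfaces on raw hours and on log(hours + 1).
# If the log surface explains more deviance, the effect of heat
# duration is closer to log-linear. (Assumes try.NMDS and temp exist.)
surf_raw <- ordisurf(try.NMDS ~ temp$Heat, plot = FALSE)
surf_log <- ordisurf(try.NMDS ~ log(temp$Heat + 1), plot = FALSE)
summary(surf_raw)$dev.expl  # proportion of deviance explained, raw hours
summary(surf_log)$dev.expl  # proportion of deviance explained, log hours
```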
################
# check significance of heat categories using a PERMANOVA to test the differences in community composition
nmdsdata_test_Heat <- adonis2(abundance_mat_YESprotegens ~ Heat, absDen_wide_forOrd_YESprotegens,
permutations = 999, method = "bray")
print(nmdsdata_test_Heat)
## Permutation test for adonis under reduced model
## Permutation: free
## Number of permutations: 999
##
## adonis2(formula = abundance_mat_YESprotegens ~ Heat, data = absDen_wide_forOrd_YESprotegens, permutations = 999, method = "bray")
## Df SumOfSqs R2 F Pr(>F)
## Model 4 7.7803 0.51442 47.673 0.001 ***
## Residual 180 7.3440 0.48558
## Total 184 15.1242 1.00000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# check PERMANOVA assumption of homogeneous group variances
# Bray-curtis distance matrix
dist_mat <- vegdist(abundance_mat_YESprotegens, method = "bray")
# use betadisper test to check for multivariate homogeneity of group variances
dispersion <- betadisper(dist_mat, group = absDen_wide_forOrd_YESprotegens$Heat)
## Warning in betadisper(dist_mat, group = absDen_wide_forOrd_YESprotegens$Heat):
## some squared distances are negative and changed to zero
permutest(dispersion)
##
## Permutation test for homogeneity of multivariate dispersions
## Permutation: free
## Number of permutations: 999
##
## Response: Distances
## Df Sum Sq Mean Sq F N.Perm Pr(>F)
## Groups 4 0.16565 0.041411 3.3053 999 0.011 *
## Residuals 180 2.25518 0.012529
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# The dispersions are heterogeneous (p = 0.011), so the PERMANOVA result should be interpreted with caution...
# Let's double check the significance of the NMDS using ANOSIM
nmdsdata_test2_Heat <- anosim(dist_mat,
grouping = absDen_wide_forOrd_YESprotegens$Heat,
permutations = 999)
plot(nmdsdata_test2_Heat)
summary(nmdsdata_test2_Heat)
##
## Call:
## anosim(x = dist_mat, grouping = absDen_wide_forOrd_YESprotegens$Heat, permutations = 999)
## Dissimilarity: bray
##
## ANOSIM statistic R: 0.5424
## Significance: 0.001
##
## Permutation: free
## Number of permutations: 999
##
## Upper quantiles of permutations (null model):
## 90% 95% 97.5% 99%
## 0.0125 0.0178 0.0213 0.0265
##
## Dissimilarity ranks between and within classes:
## 0% 25% 50% 75% 100% N
## Between 15 5620.50 9692.0 13422.5 17020 13675
## control 1 635.00 1654.0 3750.0 14815 595
## 6 4 1377.50 3347.0 5610.5 16484 780
## 12 6 1842.75 5857.5 9136.0 16035 780
## 24 65 2723.00 5115.0 8461.0 15035 595
## 48 2 1156.50 3085.0 7114.5 16982 595
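Since permutest flagged heterogeneous dispersions, it may help to see which heat groups drive it. A hedged sketch (assumes the `dispersion` object from betadisper is still in scope, i.e. run before the clean-up): vegan's betadisper objects support TukeyHSD() for pairwise dispersion contrasts and boxplot() for distances to group centroids.

```r
# Pairwise comparison of group dispersions (distance to centroid),
# with Tukey-adjusted p-values; assumes `dispersion` is still in scope.
TukeyHSD(dispersion)
# Visual check: distance-to-centroid by Heat group
boxplot(dispersion)
```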
# clean up
rm(absDen_wide_forOrd_YESprotegens, abundance_mat_YESprotegens, scree_out, try.NMDS, test, temp, nmdsdata_test_Heat, dist_mat, dispersion, nmdsdata_test2_Heat)
Okay. So when we look only at the communities with P. protegens, we
do find a significant gradient created by heat duration. The 48h
duration (and, to some extent, 24h) is still quite different from the
others, but at least heat now moves the communities in a consistent
direction. Note that when we plot the heat-duration surface (using
ordisurf), the bulged shape of the contour lines suggests that the
effect of heat duration is not really (log-)linear.
Unfortunately, we cannot resolve any separation between the resistance and recovery time points…
The other thing we know is that the replicates which go extinct have a big impact on the data, especially at 48h for communities WITHOUT P. protegens, whereas communities WITH P. protegens show a gradient effect of heat duration…
So maybe if we throw out all the extinct wells but consider all communities, we can recover the consistent effect of heat duration? (i.e., maybe it's the 19 extinct wells that are making the 48h treatment look more like the control.)
# keep only the rows where the communities did NOT go extinct
absDen_wide_forOrd_REMOVEextinct <- absDen_wide_forOrd %>% filter(uniqID %in% extinct.df$uniqID[extinct.df$survived == 1])
abundance_mat_REMOVEextinct <- as.matrix(absDen_wide_forOrd_REMOVEextinct[,9:20])
# check the best number of dimensions for NMDS
scree_out <- NMDS.scree(abundance_mat_REMOVEextinct)
## Square root transformation
## Wisconsin double standardization
## Run 0 stress 0.2226219
## Run 1 stress 0.3270418
## Run 2 stress 0.3290109
## Run 3 stress 0.3224916
## Run 4 stress 0.2795501
## Run 5 stress 0.3090371
## Run 6 stress 0.2947786
## Run 7 stress 0.413065
## Run 8 stress 0.2751921
## Run 9 stress 0.2795654
## Run 10 stress 0.3457175
## Run 11 stress 0.2767806
## Run 12 stress 0.2824923
## Run 13 stress 0.2601246
## Run 14 stress 0.2372844
## Run 15 stress 0.2514252
## Run 16 stress 0.2789774
## Run 17 stress 0.2755725
## Run 18 stress 0.2594534
## Run 19 stress 0.2721104
## Run 20 stress 0.2551188
## *** Best solution was not repeated -- monoMDS stopping criteria:
## 3: stress ratio > sratmax
## 17: scale factor of the gradient < sfgrmin
## Square root transformation
## Wisconsin double standardization
## Run 0 stress 0.07360383
## Run 1 stress 0.1421991
## Run 2 stress 0.1148873
## Run 3 stress 0.1282334
## Run 4 stress 0.1103001
## Run 5 stress 0.1116954
## Run 6 stress 0.1305709
## Run 7 stress 0.1207495
## Run 8 stress 0.1202432
## Run 9 stress 0.1479339
## Run 10 stress 0.07360384
## ... Procrustes: rmse 3.887653e-05 max resid 0.0003183676
## ... Similar to previous best
## Run 11 stress 0.1533459
## Run 12 stress 0.1240419
## Run 13 stress 0.1347851
## Run 14 stress 0.134878
## Run 15 stress 0.1230591
## Run 16 stress 0.09367789
## Run 17 stress 0.1371836
## Run 18 stress 0.1492895
## Run 19 stress 0.1332698
## Run 20 stress 0.1285227
## *** Best solution repeated 1 times
## Square root transformation
## Wisconsin double standardization
## Run 0 stress 0.03497605
## Run 1 stress 0.04298089
## Run 2 stress 0.03525006
## ... Procrustes: rmse 0.005598757 max resid 0.03588643
## Run 3 stress 0.0351187
## ... Procrustes: rmse 0.0008098494 max resid 0.01307941
## Run 4 stress 0.04458539
## Run 5 stress 0.0351199
## ... Procrustes: rmse 0.0007831421 max resid 0.01306356
## Run 6 stress 0.03504664
## ... Procrustes: rmse 0.01508825 max resid 0.04313535
## Run 7 stress 0.04771949
## Run 8 stress 0.03517531
## ... Procrustes: rmse 0.01602821 max resid 0.05454132
## Run 9 stress 0.03513851
## ... Procrustes: rmse 0.001337497 max resid 0.01440978
## Run 10 stress 0.03671395
## Run 11 stress 0.03555869
## Run 12 stress 0.04427155
## Run 13 stress 0.03527022
## ... Procrustes: rmse 0.005681001 max resid 0.03567179
## Run 14 stress 0.03503182
## ... Procrustes: rmse 0.01504217 max resid 0.04275194
## Run 15 stress 0.03520342
## ... Procrustes: rmse 0.01506778 max resid 0.04334043
## Run 16 stress 0.0350322
## ... Procrustes: rmse 0.0150449 max resid 0.04274572
## Run 17 stress 0.03684527
## Run 18 stress 0.03525558
## ... Procrustes: rmse 0.005559473 max resid 0.03578915
## Run 19 stress 0.03517535
## ... Procrustes: rmse 0.01602843 max resid 0.0545571
## Run 20 stress 0.03516895
## ... Procrustes: rmse 0.01604355 max resid 0.05454463
## *** Best solution was not repeated -- monoMDS stopping criteria:
## 1: no. of iterations >= maxit
## 19: scale factor of the gradient < sfgrmin
## Square root transformation
## Wisconsin double standardization
## Run 0 stress 0.02335853
## Run 1 stress 0.02361979
## ... Procrustes: rmse 0.02008993 max resid 0.05165137
## Run 2 stress 0.02491809
## Run 3 stress 0.02383549
## ... Procrustes: rmse 0.008606673 max resid 0.05239591
## Run 4 stress 0.02429759
## Run 5 stress 0.02444666
## Run 6 stress 0.02474817
## Run 7 stress 0.02429905
## Run 8 stress 0.02374604
## ... Procrustes: rmse 0.008697983 max resid 0.05293635
## Run 9 stress 0.02442553
## Run 10 stress 0.02429553
## Run 11 stress 0.02356121
## ... Procrustes: rmse 0.004096523 max resid 0.02791802
## Run 12 stress 0.02459354
## Run 13 stress 0.02508394
## Run 14 stress 0.0251637
## Run 15 stress 0.0243454
## Run 16 stress 0.02338848
## ... Procrustes: rmse 0.0200133 max resid 0.05136738
## Run 17 stress 0.02511694
## Run 18 stress 0.02467668
## Run 19 stress 0.02477052
## Run 20 stress 0.02456015
## *** Best solution was not repeated -- monoMDS stopping criteria:
## 10: no. of iterations >= maxit
## 10: scale factor of the gradient < sfgrmin
## Square root transformation
## Wisconsin double standardization
## Run 0 stress 0.01706263
## Run 1 stress 0.01722232
## ... Procrustes: rmse 0.006914601 max resid 0.04524389
## Run 2 stress 0.01739903
## ... Procrustes: rmse 0.008318851 max resid 0.05159938
## Run 3 stress 0.01680363
## ... New best solution
## ... Procrustes: rmse 0.01108565 max resid 0.05692945
## Run 4 stress 0.01714706
## ... Procrustes: rmse 0.01023101 max resid 0.05719691
## Run 5 stress 0.01711653
## ... Procrustes: rmse 0.01165597 max resid 0.05291328
## Run 6 stress 0.01702944
## ... Procrustes: rmse 0.006435934 max resid 0.04179216
## Run 7 stress 0.01702344
## ... Procrustes: rmse 0.008496399 max resid 0.0447002
## Run 8 stress 0.01688558
## ... Procrustes: rmse 0.009479266 max resid 0.05622467
## Run 9 stress 0.01748861
## Run 10 stress 0.01736758
## Run 11 stress 0.01786957
## Run 12 stress 0.01714848
## ... Procrustes: rmse 0.007243098 max resid 0.03529455
## Run 13 stress 0.01744258
## Run 14 stress 0.01747819
## Run 15 stress 0.0170088
## ... Procrustes: rmse 0.0101351 max resid 0.0618685
## Run 16 stress 0.0179919
## Run 17 stress 0.01753559
## Run 18 stress 0.01805619
## Run 19 stress 0.01703597
## ... Procrustes: rmse 0.01312336 max resid 0.05681934
## Run 20 stress 0.01758987
## *** Best solution was not repeated -- monoMDS stopping criteria:
## 19: no. of iterations >= maxit
## 1: scale factor of the gradient < sfgrmin
## Square root transformation
## Wisconsin double standardization
## Run 0 stress 0.012613
## Run 1 stress 0.01341585
## Run 2 stress 0.01323461
## Run 3 stress 0.01335556
## Run 4 stress 0.01292373
## ... Procrustes: rmse 0.01249242 max resid 0.06810562
## Run 5 stress 0.01283785
## ... Procrustes: rmse 0.005055423 max resid 0.03629281
## Run 6 stress 0.0126987
## ... Procrustes: rmse 0.00303731 max resid 0.02410796
## Run 7 stress 0.01340261
## Run 8 stress 0.01257247
## ... New best solution
## ... Procrustes: rmse 0.005163648 max resid 0.02677521
## Run 9 stress 0.01393608
## Run 10 stress 0.01272978
## ... Procrustes: rmse 0.01170964 max resid 0.06307661
## Run 11 stress 0.01312396
## Run 12 stress 0.01309318
## Run 13 stress 0.01319632
## Run 14 stress 0.01257967
## ... Procrustes: rmse 0.00364189 max resid 0.0117858
## Run 15 stress 0.01333995
## Run 16 stress 0.01329863
## Run 17 stress 0.01276669
## ... Procrustes: rmse 0.01136297 max resid 0.06595586
## Run 18 stress 0.01302607
## ... Procrustes: rmse 0.00563871 max resid 0.02999376
## Run 19 stress 0.01308329
## Run 20 stress 0.01266793
## ... Procrustes: rmse 0.0110489 max resid 0.06792136
## *** Best solution was not repeated -- monoMDS stopping criteria:
## 20: no. of iterations >= maxit
## Square root transformation
## Wisconsin double standardization
## Run 0 stress 0.01018056
## Run 1 stress 0.01082652
## Run 2 stress 0.01043103
## ... Procrustes: rmse 0.008531717 max resid 0.04000073
## Run 3 stress 0.01060217
## ... Procrustes: rmse 0.007638138 max resid 0.03751403
## Run 4 stress 0.01151359
## Run 5 stress 0.01031646
## ... Procrustes: rmse 0.009200684 max resid 0.05888363
## Run 6 stress 0.01069023
## Run 7 stress 0.01031426
## ... Procrustes: rmse 0.007776684 max resid 0.0353911
## Run 8 stress 0.01045226
## ... Procrustes: rmse 0.01164266 max resid 0.07213054
## Run 9 stress 0.01066145
## ... Procrustes: rmse 0.007491748 max resid 0.0434988
## Run 10 stress 0.01077333
## Run 11 stress 0.0106015
## ... Procrustes: rmse 0.006133618 max resid 0.04092284
## Run 12 stress 0.01072753
## Run 13 stress 0.01056273
## ... Procrustes: rmse 0.007743714 max resid 0.03526223
## Run 14 stress 0.01056949
## ... Procrustes: rmse 0.005722825 max resid 0.03471382
## Run 15 stress 0.011275
## Run 16 stress 0.01036087
## ... Procrustes: rmse 0.006974566 max resid 0.03579184
## Run 17 stress 0.01091316
## Run 18 stress 0.01033003
## ... Procrustes: rmse 0.002771702 max resid 0.01198565
## Run 19 stress 0.01034078
## ... Procrustes: rmse 0.007956342 max resid 0.03661587
## Run 20 stress 0.01068852
## *** Best solution was not repeated -- monoMDS stopping criteria:
## 20: no. of iterations >= maxit
## Square root transformation
## Wisconsin double standardization
## Run 0 stress 0.009304755
## Run 1 stress 0.009079751
## ... New best solution
## ... Procrustes: rmse 0.007392154 max resid 0.03508684
## Run 2 stress 0.009921875
## Run 3 stress 0.00955019
## ... Procrustes: rmse 0.009204975 max resid 0.03120943
## Run 4 stress 0.009332197
## ... Procrustes: rmse 0.009481208 max resid 0.0408877
## Run 5 stress 0.009250819
## ... Procrustes: rmse 0.007829802 max resid 0.03912463
## Run 6 stress 0.009386853
## ... Procrustes: rmse 0.008065654 max resid 0.0328852
## Run 7 stress 0.009799591
## Run 8 stress 0.009419871
## ... Procrustes: rmse 0.006449455 max resid 0.03260375
## Run 9 stress 0.00934903
## ... Procrustes: rmse 0.007089081 max resid 0.02737301
## Run 10 stress 0.009430424
## ... Procrustes: rmse 0.006765107 max resid 0.02908023
## Run 11 stress 0.009313857
## ... Procrustes: rmse 0.005841602 max resid 0.02412507
## Run 12 stress 0.009583354
## Run 13 stress 0.009496379
## ... Procrustes: rmse 0.009679609 max resid 0.03651898
## Run 14 stress 0.009162148
## ... Procrustes: rmse 0.006400849 max resid 0.03590499
## Run 15 stress 0.009294992
## ... Procrustes: rmse 0.008199929 max resid 0.03287096
## Run 16 stress 0.009585375
## Run 17 stress 0.009728722
## Run 18 stress 0.009263578
## ... Procrustes: rmse 0.005990218 max resid 0.02742862
## Run 19 stress 0.009521784
## ... Procrustes: rmse 0.009676745 max resid 0.05685618
## Run 20 stress 0.009445445
## ... Procrustes: rmse 0.009117847 max resid 0.04207846
## *** Best solution was not repeated -- monoMDS stopping criteria:
## 20: no. of iterations >= maxit
## Square root transformation
## Wisconsin double standardization
## Run 0 stress 0.008295943
## Run 1 stress 0.008496419
## ... Procrustes: rmse 0.008619613 max resid 0.03715397
## Run 2 stress 0.008537797
## ... Procrustes: rmse 0.007566519 max resid 0.02848983
## Run 3 stress 0.008628008
## ... Procrustes: rmse 0.008139803 max resid 0.04384642
## Run 4 stress 0.008792971
## ... Procrustes: rmse 0.009290983 max resid 0.04193234
## Run 5 stress 0.008617083
## ... Procrustes: rmse 0.008562764 max resid 0.03362268
## Run 6 stress 0.008480338
## ... Procrustes: rmse 0.008265676 max resid 0.05355241
## Run 7 stress 0.008857265
## Run 8 stress 0.008469053
## ... Procrustes: rmse 0.008309353 max resid 0.03706517
## Run 9 stress 0.008831019
## Run 10 stress 0.008553349
## ... Procrustes: rmse 0.007611489 max resid 0.03850816
## Run 11 stress 0.009247895
## Run 12 stress 0.008854818
## Run 13 stress 0.00877604
## ... Procrustes: rmse 0.009116165 max resid 0.03244983
## Run 14 stress 0.008806564
## Run 15 stress 0.008922373
## Run 16 stress 0.009056723
## Run 17 stress 0.008788702
## ... Procrustes: rmse 0.008584848 max resid 0.03447244
## Run 18 stress 0.008418406
## ... Procrustes: rmse 0.005876674 max resid 0.03070729
## Run 19 stress 0.008749196
## ... Procrustes: rmse 0.009382408 max resid 0.04115441
## Run 20 stress 0.008793837
## ... Procrustes: rmse 0.008845845 max resid 0.03593101
## *** Best solution was not repeated -- monoMDS stopping criteria:
## 20: no. of iterations >= maxit
## Square root transformation
## Wisconsin double standardization
## Run 0 stress 0.007479789
## Run 1 stress 0.008208023
## Run 2 stress 0.007856875
## ... Procrustes: rmse 0.007095704 max resid 0.02434997
## Run 3 stress 0.00785574
## ... Procrustes: rmse 0.009063817 max resid 0.03896258
## Run 4 stress 0.008391099
## Run 5 stress 0.008322596
## Run 6 stress 0.007720381
## ... Procrustes: rmse 0.007367781 max resid 0.02736628
## Run 7 stress 0.008697323
## Run 8 stress 0.007910849
## ... Procrustes: rmse 0.007780136 max resid 0.02595165
## Run 9 stress 0.008117482
## Run 10 stress 0.008174521
## Run 11 stress 0.008043871
## Run 12 stress 0.008095074
## Run 13 stress 0.008055021
## Run 14 stress 0.007832844
## ... Procrustes: rmse 0.005869936 max resid 0.01978361
## Run 15 stress 0.008508935
## Run 16 stress 0.008054082
## Run 17 stress 0.008658626
## Run 18 stress 0.008162483
## Run 19 stress 0.008158718
## Run 20 stress 0.008161276
## *** Best solution was not repeated -- monoMDS stopping criteria:
## 20: no. of iterations >= maxit
plot(scree_out)
# k = 3 looks great
try.NMDS <- metaMDS(abundance_mat_REMOVEextinct, distance = "bray", k = 3, autotransform = TRUE, trymax=100)
## Square root transformation
## Wisconsin double standardization
## Run 0 stress 0.03497605
## Run 1 stress 0.03525711
## ... Procrustes: rmse 0.005585306 max resid 0.03614704
## Run 2 stress 0.03688201
## Run 3 stress 0.03554516
## Run 4 stress 0.03525554
## ... Procrustes: rmse 0.005561097 max resid 0.0357753
## Run 5 stress 0.0351887
## ... Procrustes: rmse 0.01502893 max resid 0.04291864
## Run 6 stress 0.03518822
## ... Procrustes: rmse 0.01606609 max resid 0.05451377
## Run 7 stress 0.03518879
## ... Procrustes: rmse 0.01607117 max resid 0.05449844
## Run 8 stress 0.0354238
## ... Procrustes: rmse 0.005159167 max resid 0.03525606
## Run 9 stress 0.03517627
## ... Procrustes: rmse 0.01603024 max resid 0.05452777
## Run 10 stress 0.03503206
## ... Procrustes: rmse 0.01504669 max resid 0.04255321
## Run 11 stress 0.03497598
## ... New best solution
## ... Procrustes: rmse 5.36466e-05 max resid 0.00057828
## ... Similar to previous best
## Run 12 stress 0.03497638
## ... Procrustes: rmse 0.0002917228 max resid 0.00421389
## ... Similar to previous best
## Run 13 stress 0.0351753
## ... Procrustes: rmse 0.01602433 max resid 0.05453874
## Run 14 stress 0.03517541
## ... Procrustes: rmse 0.01602501 max resid 0.05453285
## Run 15 stress 0.03518904
## ... Procrustes: rmse 0.01502612 max resid 0.04345275
## Run 16 stress 0.03525552
## ... Procrustes: rmse 0.00556468 max resid 0.03587325
## Run 17 stress 0.0349758
## ... New best solution
## ... Procrustes: rmse 6.080233e-05 max resid 0.0006547625
## ... Similar to previous best
## Run 18 stress 0.03517602
## ... Procrustes: rmse 0.01603536 max resid 0.05454452
## Run 19 stress 0.03525035
## ... Procrustes: rmse 0.005623592 max resid 0.03641809
## Run 20 stress 0.03517558
## ... Procrustes: rmse 0.01603239 max resid 0.05454683
## *** Best solution repeated 1 times
# plot the results for axis 1 & 2
ordiplot(try.NMDS, type = "n") # create blank ordination plot
orditorp(try.NMDS, display = "sites", cex = 0.5, air = 0.1) # add row numbers in black
orditorp(try.NMDS, display = "species", col="red", air = 0.1) # add species names in red
# plot the results for axis 1 & 3
ordiplot(try.NMDS, choices = c(1,3), type = "n") # create blank ordination plot
orditorp(try.NMDS, choices = c(1,3), display = "sites", cex = 0.5, air = 0.1) # add row numbers in black
orditorp(try.NMDS, choices = c(1,3), display = "species", col="red", air = 0.1) # add species names in red
# plot the results for axis 2 & 3
ordiplot(try.NMDS, choices = c(2,3), type = "n") # create blank ordination plot
orditorp(try.NMDS, choices = c(2,3), display = "sites", cex = 0.5, air = 0.1) # add row numbers in black
orditorp(try.NMDS, choices = c(2,3), display = "species", col="red", air = 0.1) # add species names in red
# okay so these look fairly similar to the 2nd NMDS that I did (for all communities as time series)
# ... in other words, rather confusing because the heat effect is hidden under the protegens effect
# let's go straight to plotting the protegens x heat effect using ggplot
nmds_for_ggplot <- cbind(absDen_wide_forOrd_REMOVEextinct[,1:8],
as.data.frame(scores(try.NMDS)$sites))
# create a new factor that defines the combination of heat and protegens
nmds_for_ggplot <- nmds_for_ggplot %>% unite("HeatxProtegens", c(Heat, protegens), remove = FALSE)
nmds_for_ggplot$HeatxProtegens <- factor(nmds_for_ggplot$HeatxProtegens,
levels = c("6_0", "6_1", "12_0", "12_1", "24_0", "24_1", "48_0", "48_1", "control_0", "control_1"))
# create empty dataframes to combine NMDS data with ellipse data
ellipse12_df <- ellipse13_df <- ellipse23_df <- data.frame() # numbers indicate the ordination axes
# adding data for ellipses, using HeatxProtegens as a grouping factor
for(g in levels(nmds_for_ggplot$HeatxProtegens)){
ellipse12_df <- rbind(ellipse12_df, cbind(as.data.frame(with(nmds_for_ggplot[nmds_for_ggplot$HeatxProtegens==g,],
veganCovEllipse(cov.wt(cbind(NMDS1, NMDS2),
wt=rep(1/length(NMDS1),length(NMDS1)))$cov,
center=c(mean(NMDS1),mean(NMDS2)))))
, HeatxProtegens=g))
ellipse13_df <- rbind(ellipse13_df, cbind(as.data.frame(with(nmds_for_ggplot[nmds_for_ggplot$HeatxProtegens==g,],
veganCovEllipse(cov.wt(cbind(NMDS1, NMDS3),
wt=rep(1/length(NMDS1),length(NMDS1)))$cov,
center=c(mean(NMDS1),mean(NMDS3)))))
, HeatxProtegens=g))
ellipse23_df <- rbind(ellipse23_df, cbind(as.data.frame(with(nmds_for_ggplot[nmds_for_ggplot$HeatxProtegens==g,],
veganCovEllipse(cov.wt(cbind(NMDS2, NMDS3),
wt=rep(1/length(NMDS2),length(NMDS2)))$cov,
center=c(mean(NMDS2),mean(NMDS3)))))
, HeatxProtegens=g))
}
# now we separate the HeatxProtegens columns:
ellipse12_df <- ellipse12_df %>% separate(HeatxProtegens, c("Heat", "protegens"))
ellipse12_df$Heat <- factor(ellipse12_df$Heat, levels = levels(nmds_for_ggplot$Heat))
ellipse13_df <- ellipse13_df %>% separate(HeatxProtegens, c("Heat", "protegens"))
ellipse13_df$Heat <- factor(ellipse13_df$Heat, levels = levels(nmds_for_ggplot$Heat))
ellipse23_df <- ellipse23_df %>% separate(HeatxProtegens, c("Heat", "protegens"))
ellipse23_df$Heat <- factor(ellipse23_df$Heat, levels = levels(nmds_for_ggplot$Heat))
nmds_for_ggplot$protegens <- as.character(nmds_for_ggplot$protegens) # this needs to be discrete (could also be a factor)
# and finally we can make the plots:
ggplot(data = nmds_for_ggplot, aes(NMDS1, NMDS2)) +
geom_point(aes(color = Heat, shape = protegens), alpha=0.4) + # adding different colours and shapes for points at different distances
geom_path(data=ellipse12_df, aes(x=NMDS1, y=NMDS2, colour=Heat, linetype=protegens), linewidth=1) + # adding covariance ellipses according to distance # use size argument if ggplot2 < v. 3.4.0
guides(color = guide_legend(override.aes = list(linetype=rep(NA,5)))) + # removes lines from colour part of the legend
scale_colour_viridis_d(option = "plasma", begin=0.05, end = 0.9) +
fave_theme + # not sure why, but this is needed to override the default grey theme
labs(title="NMDS without extinctions (4sp & 3 time-points)")
# axes 1 & 2 again showing just the ellipses (bc it's hard to see protegens effects as it's so overlapped)
ggplot(data = nmds_for_ggplot, aes(NMDS1, NMDS2)) +
geom_path(data=ellipse12_df, aes(x=NMDS1, y=NMDS2, colour=Heat, linetype=protegens), linewidth=1) + # plot just the ellipses
scale_colour_viridis_d(option = "plasma", begin=0.05, end = 0.9) +
fave_theme + # not sure why, but this is needed to override the default grey theme
labs(title="NMDS without extinctions (4sp & 3 time-points)")
ggplot(data = nmds_for_ggplot, aes(NMDS1, NMDS3)) +
geom_point(aes(color = Heat, shape = protegens), alpha=0.4) +
geom_path(data=ellipse13_df, aes(x=NMDS1, y=NMDS3, colour=Heat, linetype=protegens), linewidth=1) +
guides(color = guide_legend(override.aes = list(linetype=rep(NA,5)))) +
scale_colour_viridis_d(option = "plasma", begin=0.05, end = 0.9) +
fave_theme +
labs(title="NMDS without extinctions (4sp & 3 time-points)")
# axes 1 & 3 again showing just the ellipses
ggplot(data = nmds_for_ggplot, aes(NMDS1, NMDS3)) +
geom_path(data=ellipse13_df, aes(x=NMDS1, y=NMDS3, colour=Heat, linetype=protegens), linewidth=1) +
scale_colour_viridis_d(option = "plasma", begin=0.05, end = 0.9) +
fave_theme +
labs(title="NMDS without extinctions (4sp & 3 time-points)")
ggplot(data = nmds_for_ggplot, aes(NMDS2, NMDS3)) +
geom_point(aes(color = Heat, shape = protegens), alpha=0.4) +
geom_path(data=ellipse23_df, aes(x=NMDS2, y=NMDS3, colour=Heat, linetype=protegens), linewidth=1) +
guides(color = guide_legend(override.aes = list(linetype=rep(NA,5)))) +
scale_colour_viridis_d(option = "plasma", begin=0.05, end = 0.9) +
fave_theme +
labs(title="NMDS without extinctions (4sp & 3 time-points)")
# axes 2 & 3 again showing just the ellipses (bc it's hard to see protegens effects as it's so overlapped)
ggplot(data = nmds_for_ggplot, aes(NMDS2, NMDS3)) +
geom_path(data=ellipse23_df, aes(x=NMDS2, y=NMDS3, colour=Heat, linetype=protegens), linewidth=1) +
scale_colour_viridis_d(option = "plasma", begin=0.05, end = 0.9) +
fave_theme +
labs(title="NMDS without extinctions (4sp & 3 time-points)")
# let's see if we can detect a heat gradient now that the 48h seems somewhat better behaved
test <- absDen_wide_forOrd_REMOVEextinct
test$Heat <- as.character(levels(test$Heat))[test$Heat]
test$Heat[test$Heat == "control"] <- 0
test$Heat <- as.numeric(test$Heat)
# for proper labelling of the arrow, need to supply env as a data.frame
temp <- as.data.frame(test$Heat)
colnames(temp) <- "Heat"
## Amazing, the gradient is finally significant!
gg_envfit(try.NMDS, env = temp, groups = as.factor(absDen_wide_forOrd_REMOVEextinct$protegens), plot = TRUE, alpha=0.01)
gg_envfit(try.NMDS, env = temp, groups = as.factor(absDen_wide_forOrd_REMOVEextinct$protegens), plot = TRUE, alpha=0.01, choices=c(1,3))
gg_envfit(try.NMDS, env = temp, groups = as.factor(absDen_wide_forOrd_REMOVEextinct$protegens), plot = TRUE, alpha=0.01, choices=c(2,3))
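To put numbers on "finally significant": gg_envfit is a ggplot wrapper around vegan's envfit, which reports the fitted vector's squared correlation and permutation p-value directly. A minimal sketch, assuming the try.NMDS and one-column temp objects from above:

```r
# Hedged sketch: vegan's envfit prints NMDS axis loadings, r2, and Pr(>r)
# for the fitted Heat vector across all three ordination axes.
ef <- envfit(try.NMDS, temp, permutations = 999, choices = 1:3)
ef
```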
## try plotting it as a surface contour:
ordisurf(try.NMDS ~ temp$Heat)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## y ~ s(x1, x2, k = 10, bs = "tp", fx = FALSE)
##
## Estimated degrees of freedom:
## 7.48 total = 8.48
##
## REML score: 1245.315
ordisurf(try.NMDS ~ temp$Heat, choices = c(1,3))
##
## Family: gaussian
## Link function: identity
##
## Formula:
## y ~ s(x1, x2, k = 10, bs = "tp", fx = FALSE)
##
## Estimated degrees of freedom:
## 8.77 total = 9.77
##
## REML score: 1076.068
ordisurf(try.NMDS ~ temp$Heat, choices = c(2,3))
##
## Family: gaussian
## Link function: identity
##
## Formula:
## y ~ s(x1, x2, k = 10, bs = "tp", fx = FALSE)
##
## Estimated degrees of freedom:
## 8.71 total = 9.71
##
## REML score: 1067.971
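If the bulged contours really do mean the heat effect is not log-linear, one quick check (a sketch, reusing try.NMDS and temp from above) is to refit the surface on log-transformed duration and compare the REML scores against the fits above:

```r
# Hedged sketch: refit the surface on log(duration + 1) -- log1p because the
# control is coded as 0 h. A markedly lower REML score than the untransformed
# fits would suggest the gradient is closer to log-linear.
ordisurf(try.NMDS ~ log1p(temp$Heat))
```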
###############
# check significance using a PERMANOVA to test the differences in community composition
nmdsdata_test_Heat <- adonis2(abundance_mat_REMOVEextinct ~ Heat, absDen_wide_forOrd_REMOVEextinct,
permutations = 999, method = "bray")
print(nmdsdata_test_Heat)
## Permutation test for adonis under reduced model
## Permutation: free
## Number of permutations: 999
##
## adonis2(formula = abundance_mat_REMOVEextinct ~ Heat, data = absDen_wide_forOrd_REMOVEextinct, permutations = 999, method = "bray")
## Df SumOfSqs R2 F Pr(>F)
## Model 4 7.610 0.0791 6.4422 0.001 ***
## Residual 300 88.598 0.9209
## Total 304 96.208 1.0000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
nmdsdata_test_Prot <- adonis2(abundance_mat_REMOVEextinct ~ protegens, absDen_wide_forOrd_REMOVEextinct,
permutations = 999, method = "bray")
print(nmdsdata_test_Prot)
## Permutation test for adonis under reduced model
## Permutation: free
## Number of permutations: 999
##
## adonis2(formula = abundance_mat_REMOVEextinct ~ protegens, data = absDen_wide_forOrd_REMOVEextinct, permutations = 999, method = "bray")
## Df SumOfSqs R2 F Pr(>F)
## Model 1 44.695 0.46456 262.89 0.001 ***
## Residual 303 51.514 0.53544
## Total 304 96.208 1.00000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
nmdsdata_test_HeatxProt <- adonis2(abundance_mat_REMOVEextinct ~ Heat * protegens, absDen_wide_forOrd_REMOVEextinct,
permutations = 999, method = "bray")
print(nmdsdata_test_HeatxProt)
## Permutation test for adonis under reduced model
## Permutation: free
## Number of permutations: 999
##
## adonis2(formula = abundance_mat_REMOVEextinct ~ Heat * protegens, data = absDen_wide_forOrd_REMOVEextinct, permutations = 999, method = "bray")
## Df SumOfSqs R2 F Pr(>F)
## Model 9 58.410 0.60712 50.651 0.001 ***
## Residual 295 37.799 0.39288
## Total 304 96.208 1.00000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
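The Heat * protegens fit above reports only a single Model row; to see how much each term (and the interaction itself) contributes, adonis2 can partition the variance per term. A sketch, assuming the same objects:

```r
# Hedged sketch: by = "terms" partitions sequentially (Heat, then protegens,
# then Heat:protegens); by = "margin" would test marginal effects instead.
adonis2(abundance_mat_REMOVEextinct ~ Heat * protegens,
        data = absDen_wide_forOrd_REMOVEextinct,
        permutations = 999, method = "bray", by = "terms")
```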
# so these are all significant but is that spurious because the dispersion is different btw groups? (e.g., much smaller for protegens)
##############
# check PERMANOVA assumption of homogeneous group variances
# Bray-curtis distance matrix
dist_mat <- vegdist(abundance_mat_REMOVEextinct, method = "bray")
# use betadisper test to check for multivariate homogeneity of group variances
dispersion <- betadisper(dist_mat, group = paste(absDen_wide_forOrd_REMOVEextinct$Heat, absDen_wide_forOrd_REMOVEextinct$protegens))
## Warning in betadisper(dist_mat, group =
## paste(absDen_wide_forOrd_REMOVEextinct$Heat, : some squared distances are
## negative and changed to zero
permutest(dispersion)
##
## Permutation test for homogeneity of multivariate dispersions
## Permutation: free
## Number of permutations: 999
##
## Response: Distances
## Df Sum Sq Mean Sq F N.Perm Pr(>F)
## Groups 9 6.2898 0.69887 15.375 999 0.001 ***
## Residuals 295 13.4095 0.04546
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Yep! We need to try a different test that is robust to heterogeneous group variances...
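Before switching tests entirely, it may be worth seeing which group pairs drive the dispersion difference; vegan provides a TukeyHSD method for betadisper objects. A sketch, assuming the dispersion object from above:

```r
# Hedged sketch: pairwise comparisons of group dispersions. Confidence
# intervals excluding zero flag the Heat x protegens group pairs whose
# spread differs; plot() draws the intervals.
disp_hsd <- TukeyHSD(dispersion)
plot(disp_hsd)
```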
################
# check significance again using ANOSIM (which is another non-parametric test but this time only considering the ranks)
nmdsdata_test2_HeatxProt <- anosim(dist_mat,
grouping = paste(absDen_wide_forOrd_REMOVEextinct$Heat, absDen_wide_forOrd_REMOVEextinct$protegens),
permutations = 999)
plot(nmdsdata_test2_HeatxProt)
## Warning in (function (z, notch = FALSE, width = NULL, varwidth = FALSE, : some
## notches went outside hinges ('box'): maybe set notch=FALSE
summary(nmdsdata_test2_HeatxProt)
##
## Call:
## anosim(x = dist_mat, grouping = paste(absDen_wide_forOrd_REMOVEextinct$Heat, absDen_wide_forOrd_REMOVEextinct$protegens), permutations = 999)
## Dissimilarity: bray
##
## ANOSIM statistic R: 0.6508
## Significance: 0.001
##
## Permutation: free
## Number of permutations: 999
##
## Upper quantiles of permutations (null model):
## 90% 95% 97.5% 99%
## 0.0140 0.0184 0.0223 0.0286
##
## Dissimilarity ranks between and within classes:
## 0% 25% 50% 75% 100% N
## Between 17 14076.25 25221.5 39745.00 39745 41498
## 12 0 126 4811.00 22025.0 24737.00 39745 253
## 12 1 7 2043.25 6605.0 10474.00 18871 780
## 24 0 1 5311.50 21702.5 32177.25 39745 406
## 24 1 72 3022.00 5716.0 9690.50 17508 595
## 48 0 20 2278.00 7348.0 14367.00 19847 66
## 48 1 3 1275.00 3424.0 8098.50 20639 595
## 6 0 58 6313.00 21240.0 39745.00 39745 561
## 6 1 5 1520.50 3719.0 6303.50 19591 780
## control 0 655 8775.00 20054.0 20867.50 39745 231
## control 1 2 699.00 1826.0 4169.50 17228 595
# clean up
rm(absDen_wide_forOrd_REMOVEextinct, abundance_mat_REMOVEextinct, scree_out, try.NMDS, nmds_for_ggplot, ellipse12_df, ellipse13_df, ellipse23_df, test, temp, nmdsdata_test_Heat, nmdsdata_test_Prot, nmdsdata_test_HeatxProt, dist_mat, dispersion, nmdsdata_test2_HeatxProt)
So if we remove the extinctions from the data, we find that increasing heat duration pushes the communities in a relatively consistent direction. As we can see from the contour plot, there is still a strong effect of P. protegens presence, but the net effect of heat is at least consistent between the presence and absence of protegens.
Unfortunately, it’s still not possible to summarize how the communities change over time (e.g., whether resistance or recovery look more similar to one another across heat treatments)…
One problem with including all 4 species in the ordination analysis is that species identity accounts for most of the variation, yet we don’t really care about the different species per se; we are far more interested in what happens between time points across all species. So I think it would be better to focus on each species separately, so that we can see whether there is a common impact of time… \(\leftarrow\) the problem here is that I think the plot would be most informative when time is treated as an environmental variable. And if we do that as well as dropping all other species, then you’re left with just 1 column in the ordination plot…
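One univariate way around the one-column problem: instead of ordinating each species alone, model each species' (log) abundance directly against heat duration and time point. A rough sketch only; the column positions (9:20) and the species_time naming convention are assumptions about how absDen_wide_forOrd is laid out:

```r
# Hedged sketch (column positions and name separator are assumptions):
# fit log-abundance ~ Heat * timepoint, one model per species, so the
# time effect can be compared across species without an ordination.
abs_long <- absDen_wide_forOrd %>%
  pivot_longer(cols = 9:20, names_to = "species_time", values_to = "absDen") %>%
  separate(species_time, into = c("species", "timepoint"), sep = "_")
fits <- lapply(split(abs_long, abs_long$species),
               function(d) lm(log1p(absDen) ~ Heat * timepoint, data = d))
lapply(fits, summary)
```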